[Bioperl-l] Re: grouping sequences by DNA-binding domains -- elaboration

Stefan Kirov skirov at utk.edu
Tue Oct 18 15:33:36 EDT 2005


Actually Brian, Bio::SeqIO::entrezgene will extract this data from the
Entrez Gene ASN.1 file:

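use strict;
use warnings;
use Bio::SeqIO;

my $file = shift @ARGV;   # path to the Entrez Gene ASN.1 file (assumed to come from the command line)
my $fs   = "\t";          # output field separator (assumed; not set in the original snippet)

my $eio = Bio::SeqIO->new(-file => $file, -format => 'entrezgene',
                          -debug => 'off', -service_record => 'no');
my ($seq, $struct, $uncapt) = $eio->next_seq;
my $gid = $seq->accession_number;   # assumed to be the Entrez Gene ID; not set in the original snippet
my @contigs = $struct->get_members();   #(-authority=>'genomic');
foreach my $contig (@contigs) {
    if ($contig->authority eq 'Product') {
        foreach my $sf ($contig->get_SeqFeatures) {
            foreach my $dblink ($sf->annotation->get_Annotations('dblink')) {
                my $key = $dblink->{_anchor} ? $dblink->{_anchor} : $dblink->optional_id;
                my $db  = $dblink->database;
                # Keep only CDD links or features tagged as conserved domains
                next unless (($db =~ /cdd/i) || ($sf->primary_tag =~ /conserved/i));
                my $desc;
                if ($key =~ /:/) {
                    ($key, $desc) = split(/:/, $key);
                }
                # Print undefined fields as empty strings
                print join($fs, map { defined $_ ? $_ : '' }
                    $gid, $contig->id, $desc, $key, $sf->score, '', '', $db, $sf->start, $sf->end), "\n";
            }
        }
    }
}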
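
Since the original question was about grouping, here is a minimal sketch of how the rows printed by the loop above could be grouped by domain key. It assumes tab-separated rows in the same column order as the print statement; the gene, product and domain names in @rows are made-up placeholders, not real CDD entries.

```perl
use strict;
use warnings;

# Hypothetical rows in the column order printed above:
# gene id, product id, description, domain key, score, '', '', database, start, end
my @rows = (
    join("\t", 'gene1', 'prot1', 'example domain A', 'domA', '', '', '', 'CDD', 10, 80),
    join("\t", 'gene2', 'prot2', 'example domain A', 'domA', '', '', '', 'CDD', 5,  70),
    join("\t", 'gene3', 'prot3', 'example domain B', 'domB', '', '', '', 'CDD', 20, 90),
);

# Map each domain key to the list of gene ids that carry it.
my %genes_by_domain;
foreach my $row (@rows) {
    my @field = split /\t/, $row, -1;   # -1 keeps empty fields
    my ($gid, $key) = @field[0, 3];
    push @{ $genes_by_domain{$key} }, $gid;
}

# One line per domain: the key, then the genes that share it.
foreach my $key (sort keys %genes_by_domain) {
    print "$key\t", join(',', @{ $genes_by_domain{$key} }), "\n";
}
```

Deciding which of those domains are DNA-binding would still have to come from the domain descriptions or another annotation source.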

I guess it is really a good time to write those docs :-)
Stefan

Brian Osborne wrote:

>Olena,
>
>I'm pretty sure that there's no code in Bioperl that accesses or parses CDD,
>hopefully I'm corrected if I'm wrong.
>
>Brian O.
>
>
>On 10/18/05 2:26 PM, "Olena Morozova" <olenka.m at gmail.com> wrote:
>
>  
>
>>Hi Brian,
>>
>>Thank you for your reply. It is the CDD (Conserved Domain Database) on
>>the NCBI web site.
>>Olena
>>
>>On 10/18/05, Brian Osborne <brian_osborne at cognia.com> wrote:
>>    
>>
>>>Olena,
>>>
>>>What database contains the information you're looking for?
>>>
>>>Brian O.
>>>
>>>
>>>On 10/16/05 8:17 PM, "Olena Morozova" <olenka.m at gmail.com> wrote:
>>>
>>>      
>>>
>>>>Hi again,
>>>>
>>>>I just figured out how to obtain a list of conserved domains for a
>>>>given sequence using the SeqHound.pm module available at
>>>>http://www.blueprint.org/seqhound/apifunctslist.html
>>>>
>>>>Now I have a list of conserved domains for a given sequence and I need
>>>>to extract information as to what these domains are and which ones are
>>>>DNA-binding. Any help on this will be greatly appreciated.
>>>>
>>>>Thanks again,
>>>>Olena
>>>>
>>>>
>>>>On 10/16/05, Olena Morozova <olenka.m at gmail.com> wrote:
>>>>        
>>>>
>>>>>I have a list of transcription factor sequences, and I need to group
>>>>>them according to the DNA-binding domains based on the classification
>>>>>by TRANSFAC or any other database. Basically, I just need to extract
>>>>>the DNA-binding domain information for a particular TF from a database
>>>>>like TRANSFAC (I don't know what other databases would have this
>>>>>information, but any will do). Does anyone have any idea how to do this?
>>>>>Thank you very much for your help and time.
>>>>>
>>>>>Olena
>>>>>
>>>>>          
>>>>>
>>>>_______________________________________________
>>>>Bioperl-l mailing list
>>>>Bioperl-l at portal.open-bio.org
>>>>http://portal.open-bio.org/mailman/listinfo/bioperl-l

-- 
Stefan Kirov, Ph.D.
University of Tennessee/Oak Ridge National Laboratory
5700 bldg, PO BOX 2008 MS6164
Oak Ridge TN 37831-6164
USA
tel +865 576 5120
fax +865-576-5332
e-mail: skirov at utk.edu
sao at ornl.gov

"And the wars go on with brainwashed pride
For the love of God and our human rights
And all these things are swept aside"


