[Bioperl-l] Re: grouping sequences by DNA-binding domains -- elaboration

Tue Oct 18 15:33:36 EDT 2005

Actually, Brian, Bio::SeqIO::entrezgene will extract this data from the
ASN.1 file:

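use Bio::SeqIO;
# $file (ASN.1 input), $fs (output field separator) and $gid (gene ID)
# are assumed to be set elsewhere in the script.
my $eio=Bio::SeqIO->new(-file=>$file,-format=>'entrezgene',
                        -debug=>'off',-service_record=>'no');
my ($seq,$struct,$uncapt)=$eio->next_seq;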
my @contigs=$struct->get_members();#(-authority=>'genomic');
foreach my $contig (@contigs) {
    if ($contig->authority eq 'Product') {
        foreach my $sf ($contig->get_SeqFeatures) {
            foreach my $dblink ($sf->annotation->get_Annotations('dblink')) {
                my $key=$dblink->{_anchor}?$dblink->{_anchor}:$dblink->optional_id;
                my $db=$dblink->database;
                # keep only CDD links or conserved-domain features
                next unless (($db =~/cdd/i)||($sf->primary_tag=~/conserved/i));
                my $desc;
                if ($key =~ /:/) {
                    ($key,$desc)=split(/:/,$key);
                }
                print join($fs,$gid,$contig->id,$desc,$key,$sf->score,'','',
                           $db,$sf->start,$sf->end),"\n";
            }
        }
    }
}

I guess it is really a good time to write those docs :-)
Stefan

Brian Osborne wrote:

>Olena,
>
>I'm pretty sure that there's no code in Bioperl that accesses or parses CDD,
>hopefully I'm corrected if I'm wrong.
>
>Brian O.
>
>
>On 10/18/05 2:26 PM, "Olena Morozova" <olenka.m at gmail.com> wrote:
>
>  
>
>>Hi Brian,
>>
>>Thank you for your reply. It is the CDD (Conserved Domain Database) on
>>the NCBI web site.
>>Olena
>>
>>On 10/18/05, Brian Osborne <brian_osborne at cognia.com> wrote:
>>    
>>
>>>Olena,
>>>
>>>What database contains the information you're looking for?
>>>
>>>Brian O.
>>>
>>>
>>>On 10/16/05 8:17 PM, "Olena Morozova" <olenka.m at gmail.com> wrote:
>>>
>>>      
>>>
>>>>Hi again,
>>>>
>>>>I just figured out how to obtain a list of conserved domains for a
>>>>given sequence using the SeqHound.pm module available at
>>>>http://www.blueprint.org/seqhound/apifunctslist.html
>>>>
>>>>Now I have a list of conserved domains for a given sequence and I need
>>>>to extract information as to what these domains are and which ones are
>>>>DNA-binding. Any help on this will be greatly appreciated
>>>>
>>>>Thanks again,
>>>>Olena
>>>>
>>>>
>>>>On 10/16/05, Olena Morozova <olenka.m at gmail.com> wrote:
>>>>        
>>>>
>>>>>I have a list of transcription factor sequences, and I need to group
>>>>>them according to the DNA-binding domains based on the classification
>>>>>by TRANSFAC or any other database. Basically, I just need to extract
>>>>>the DNA-binding domain information for a particular TF from a database
>>>>>like TRANSFAC (I don't know what other databases would have this
>>>>>information, but any will do). Does anyone have any idea how to do this?
>>>>>Thank you very much for your help and time.
>>>>>
>>>>>Olena
>>>>>
>>>>>          
>>>>>
>>>>_______________________________________________
>>>>Bioperl-l mailing list
>>>>Bioperl-l at portal.open-bio.org
>>>>http://portal.open-bio.org/mailman/listinfo/bioperl-l

-- 
Stefan Kirov, Ph.D.
University of Tennessee/Oak Ridge National Laboratory
5700 bldg, PO BOX 2008 MS6164
Oak Ridge TN 37831-6164
USA
tel +865 576 5120
fax +865-576-5332
e-mail: skirov at utk.edu
sao at ornl.gov

"And the wars go on with brainwashed pride
For the love of God and our human rights
And all these things are swept aside"