[Bioperl-l] GO terms not present in Swiss annotation object

Chris Fields cjfields at uiuc.edu
Tue Nov 21 23:54:32 UTC 2006


Please always reply to the list as well.  I would suggest updating to
a newer version; many changes have been made to GenBank/SwissProt/EMBL
parsing since release 1.4, including the handling of dblinks.  If
you're using Windows, you'll need to follow the instructions on the
website for the latest release candidate:

http://www.bioperl.org/wiki/Installing_Bioperl_on_Windows

Note that the release candidates are located in a different  
repository, so you'll need to set that up to find them.
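
If upgrading has to wait, one stopgap is to read the GO cross-references
straight from the DR lines of the flat file.  The following is only a
minimal sketch under stated assumptions: `extract_go_terms` is a
hypothetical helper, not a BioPerl method, and it assumes GO lines follow
the `DR GO; <id>; <aspect>:<term>; <evidence>.` layout seen in the P19351
entry quoted below.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): collect GO cross-references
# from the DR lines of one Swiss-Prot flat-file entry.
# Assumed line layout:
#   DR GO; GO:0007498; P:mesoderm development; IEP:FlyBase.
sub extract_go_terms {
    my ($entry_text) = @_;
    my @terms;
    for my $line (split /\n/, $entry_text) {
        # Skip every line that is not a GO cross-reference.
        next unless $line =~ /^DR\s+GO;\s*(GO:\d+);\s*([CFP]):(.+?);\s*(.+?)\.?\s*$/;
        push @terms, { id => $1, aspect => $2, term => $3, evidence => $4 };
    }
    return @terms;
}

# Two DR lines taken from the P19351 flat-file excerpt.
my $entry = <<'END';
DR FlyBase; FBgn0004169; up.
DR GO; GO:0007498; P:mesoderm development; IEP:FlyBase.
END

# Print one tab-separated line per GO term found.
for my $go (extract_go_terms($entry)) {
    print join("\t", @{$go}{qw(id aspect term evidence)}), "\n";
}
```

Because this bypasses the BioPerl parser entirely, it does not depend on
the installed BioPerl version, but the pattern will need adjusting if the
DR lines in your copy of the database are laid out differently.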

chris

On Nov 21, 2006, at 2:32 PM, Juan Cristobal Vera wrote:

> OK, thanks for responding!
>   I'm using ActivePerl 5.8.8 build 819 on a Windows machine (sorry)
> and the BioPerl 1.4 PPM3 package.  Perhaps this is too old?
> Here's part of my code (mostly derived from bioperl docs):
> .........................
> #cut
>
> $seqInObj = $indexObj->get_Seq_by_id($line);  #get sequence and create seq object
>
> #cut
>
> if (defined $seqInObj->annotation){
>   $annotObj = $seqInObj->annotation;  #create annotation object
>   foreach $key ($annotObj->get_all_annotation_keys){
>     @values = $annotObj->get_Annotations($key);
>     foreach $value (@values){
>       if (lc($key) eq "dblink"){
>         print $outfh "Annotation: $key\n";
>         print $outfh $value->as_text,"\n";
>         $dbhash_ref = $value->hash_tree;
>         for $dbKey (keys %{$dbhash_ref}) {
>           #none of these prints produce GO terms
>           print $outfh $dbKey,": ",$dbhash_ref->{$dbKey},"\n";
>         }
>       }
>     }
>   }
> }
> .........................
> My program searches an indexed database on my machine, creates the  
> objects, and prints out relevant annotations.
> Here are some of the accessions I used for testing:
> P19351  TNNT_DROME
> P36188  TNNI_DROME
> P11147  HSP7D_DROME
> ..........................................
> the relevant output looks something like this (for debugging) for  
> P19351:
> ......................................................................
> Direct database link to X58188 in database EMBL
> database: EMBL
> comment:  -; Genomic_DNA.
> primary_id: X58188
> optional_id: CAA41171.1
> Annotation: dblink
> Direct database link to X59376 in database EMBL
> database: EMBL
> comment:  -; mRNA.
> primary_id: X59376
> optional_id: CAA42020.1
> Annotation: dblink
> Direct database link to AE003507 in database EMBL
> database: EMBL
> comment:  -; Genomic_DNA.
> primary_id: AE003507
> optional_id: AAF48802.2
> Annotation: dblink
> Direct database link to AE003507 in database EMBL
> database: EMBL
> comment:  -; Genomic_DNA.
> primary_id: AE003507
> optional_id: AAF48803.2
> Annotation: dblink
> Direct database link to AE003507 in database EMBL
> database: EMBL
> comment:  -; Genomic_DNA.
> primary_id: AE003507
> optional_id: AAF48804.2
> Annotation: dblink
> Direct database link to AE003507 in database EMBL
> database: EMBL
> comment:  -; Genomic_DNA.
> primary_id: AE003507
> optional_id: AAF48805.2
> Annotation: dblink
> Direct database link to AE003507 in database EMBL
> database: EMBL
> comment:  -; Genomic_DNA.
> primary_id: AE003507
> optional_id: AAN09458.1
> Annotation: dblink
> Direct database link to AY122145 in database EMBL
> database: EMBL
> comment:  -; mRNA.
> primary_id: AY122145
> optional_id: AAM52657.1
> Annotation: dblink
> Direct database link to A40547 in database PIR
> database: PIR
> primary_id: A40547
> optional_id: A40547
> Annotation: dblink
> Direct database link to B38594 in database PIR
> database: PIR
> primary_id: B38594
> optional_id: B38594
> Annotation: dblink
> Direct database link to Dm.1717 in database UniGene
> database: UniGene
> primary_id: Dm.1717
> optional_id: -
> Annotation: dblink
> Direct database link to P45379 in database HSSP
> database: HSSP
> primary_id: P45379
> optional_id: 1J1E
> Annotation: dblink
> Direct database link to P36188 in database IntAct
> database: IntAct
> primary_id: P36188
> optional_id: -
> Annotation: dblink
> Direct database link to dme:CG7178-PA in database KEGG
> database: KEGG
> primary_id: dme:CG7178-PA
> optional_id: -
> Annotation: dblink
> Direct database link to dme:CG7178-PB in database KEGG
> database: KEGG
> primary_id: dme:CG7178-PB
> optional_id: -
> Annotation: dblink
> Direct database link to dme:CG7178-PC in database KEGG
> database: KEGG
> primary_id: dme:CG7178-PC
> optional_id: -
> Annotation: dblink
> Direct database link to dme:CG7178-PD in database KEGG
> database: KEGG
> primary_id: dme:CG7178-PD
> optional_id: -
> Annotation: dblink
> Direct database link to dme:CG7178-PG in database KEGG
> database: KEGG
> primary_id: dme:CG7178-PG
> optional_id: -
> Annotation: dblink
> Direct database link to FBgn0004028 in database FlyBase
> database: FlyBase
> primary_id: FBgn0004028
> optional_id: wupA
> Annotation: dblink
> Direct database link to IPR001978 in database InterPro
> database: InterPro
> primary_id: IPR001978
> optional_id: Troponin
> Annotation: dblink
> Direct database link to PF00992 in database Pfam
> database: Pfam
> comment: 1
> primary_id: PF00992
> optional_id: Troponin
> ..............................................
> as you can see, no GO terms above
> ......................................................
> Compare this with the actual dblink (DR) lines in the flat file for
> P19351:
> DR EMBL; X54504; CAA38366.1; -; mRNA.
> DR EMBL; AY439172; AAR24583.1; -; Genomic_DNA.
> DR EMBL; AY439172; AAR24584.1; -; Genomic_DNA.
> DR EMBL; AY439172; AAR24585.1; -; Genomic_DNA.
> DR EMBL; AY439172; AAR24586.1; -; Genomic_DNA.
> DR EMBL; AY439172; AAR24587.1; -; Genomic_DNA.
> DR EMBL; AY665838; AAU09446.1; -; mRNA.
> DR EMBL; AE014298; AAF48288.2; -; Genomic_DNA.
> DR EMBL; AE014298; AAF48289.2; -; Genomic_DNA.
> DR EMBL; AE014298; AAF48290.1; -; Genomic_DNA.
> DR EMBL; AE014298; AAX52491.1; -; Genomic_DNA.
> DR EMBL; AE014298; AAX52492.1; -; Genomic_DNA.
> DR EMBL; AE014298; AAX52493.1; -; Genomic_DNA.
> DR EMBL; AY051989; AAK93413.1; -; mRNA.
> DR EMBL; AY070875; AAL48497.1; ALT_SEQ; mRNA.
> DR PIR; S13251; S13251.
> DR UniGene; Dm.20472; -.
> DR HSSP; P45379; 1J1E.
> DR Ensembl; CG7107; Drosophila melanogaster.
> DR KEGG; dme:CG7107-PE; -.
> DR KEGG; dme:CG7107-PF; -.
> DR KEGG; dme:CG7107-PG; -.
> DR FlyBase; FBgn0004169; up.
> DR GO; GO:0007498; P:mesoderm development; IEP:FlyBase. ......
> where the GO term is the last entry in the dblink section above.
> Any help you could provide would be most welcome.  Let me know if  
> this is insufficient information or if you need a working script.
>
>
> On Tue, 21 Nov 2006 00:19:59 -0600 Chris Fields wrote:
> > Juan,
> >
> > The DBLink objects should be generated. You'll need to give us a
> > bit more information to go on, though. We need an example
> > sequence, your local version of Bioperl, maybe a test script, etc.
> > This is the right forum for this, yes, if you are using BioPerl.
> >
> > Chris
> >
> > On Nov 20, 2006, at 6:52 PM, Juan Cristobal Vera wrote:
> >
> > > Hi,
> > > I'm writing a simple application to extract various fields from
> > > swissprot objects and I can't access the GO terms found in the
> > > "dblink" part of the swiss format flat files. I'm not a
> > > professional programmer and I can't figure out why this is
> > > occurring. All the other "dblink" keys are being generated as
> > > far as I can tell (e.g. embl, pfam, etc). The GO terms are just
> > > skipped over and it's driving me crazy. Not sure if this is a
> > > bug or a deliberate strategy I'm unfamiliar with. I apologize if
> > > this is not the correct forum to ask for this sort of help and
> > > would ask to be directed to the proper one.
> > >
> > > Juan Cristobal Vera
> > > Graduate Student
> > > Department of Biology
> > > Penn State University
> > > 208 Mueller Laboratory
> > > University Park, PA 16802
> > > (814)863-2957
> > >
> > > _______________________________________________
> > > Bioperl-l mailing list
> > > Bioperl-l at lists.open-bio.org
> > > http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >
> > Christopher Fields
> > Postdoctoral Researcher
> > Lab of Dr. Robert Switzer
> > Dept of Biochemistry
> > University of Illinois Urbana-Champaign
>
> Juan Cristobal Vera
> Graduate Student
> Department of Biology
> Penn State University
> 208 Mueller Laboratory
> University Park, PA 16802
> (814)863-2957

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign





