[Bioperl-l] EMBL format field

Kevin Brown Kevin.M.Brown at asu.edu
Thu Jun 12 15:22:11 UTC 2008


See the following links for where to get a more current version.  BioPerl 1.4
is years old, and many parts of it no longer work because of website and file
format changes.

http://www.bioperl.org/wiki/Installing_BioPerl

http://www.bioperl.org/wiki/Installing_BioPerl_on_Ubuntu_Server 
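
After upgrading, it may be worth confirming which BioPerl your scripts
actually load, since an old packaged 1.4 could still be found first in @INC.
A minimal sketch, assuming the installed release provides the
Bio::Root::Version module (bioperl_version is only a helper name made up for
this example):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the version of the BioPerl found in @INC,
# or undef if Bio::Root::Version cannot be loaded.
sub bioperl_version {
    # Assumption: the installed release ships Bio::Root::Version.
    return eval { require Bio::Root::Version; Bio::Root::Version->VERSION };
}

my $version = bioperl_version();
if ( defined $version ) {
    print "BioPerl version: $version\n";
    # %INC records the file each loaded module came from.
    print "Loaded from: $INC{'Bio/Root/Version.pm'}\n";
} else {
    print "Bio::Root::Version could not be loaded; is BioPerl installed?\n";
}
```

If the path printed points at the distribution's package directory rather
than the new install, the old copy is probably still shadowing it.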

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org 
> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of 
> Zhi-Qiang Ye
> Sent: Thursday, June 12, 2008 2:07 AM
> To: Jason Stajich
> Cc: bioperl list
> Subject: Re: [Bioperl-l] EMBL format field
> 
> Hi, Jason
> 
>      I used exactly your code, and the result is still 'unknown id'.
> Where can I get the version of bioperl?
> I use Ubuntu Gutsy; the version in Ubuntu's package
> management system is 1.4-1.
> 
>      I also installed BioPerl 1.4 on another computer (IA64 with Red Hat
> Linux), and it has the same problem.
> During installation via CPAN, 'make test' always failed, so
> I used 'force install ....'.
> I am not sure whether that is the reason.
> 
> Thanks.
> Zhi-Qiang Ye
> 
> 2008/6/11 Jason Stajich <jason at bioperl.org>:
> > What version of bioperl? It works for me using this code; I get
> > 'CB271253' printed out.
> >
> > #!/usr/bin/perl -w
> > use strict;
> > use Bio::SeqIO;
> > my $in = Bio::SeqIO->new(-format => 'embl', -file => shift);
> > while( my $seq = $in->next_seq ) {
> >  print $seq->id,"\n";
> > }
> >
> > On Jun 10, 2008, at 4:43 AM, Zhi-Qiang Ye wrote:
> >
> >> That's weird. I also met this problem. I tried an EMBL-format file like
> >> this:
> >>
> >> ID   CB271253; SV 1; linear; mRNA; EST; INV; 591 BP.
> >> XX
> >> AC   CB271253;
> >> XX
> >> DT   24-FEB-2003 (Rel. 74, Created)
> >> DT   24-FEB-2003 (Rel. 74, Last updated, Version 1)
> >> XX
> >> DE   taa17c02.x2 Hydra EST -II Hydra magnipapillata cDNA 3' similar to
> >> DE   SW:OPSD_RABIT P49912 RHODOPSIN. ;, mRNA sequence.
> >>
> >> from: http://www.ebi.ac.uk/cgi-bin/dbfetch?db=embl&id=CB271253&style=raw
> >>
> >> the $seq object's ->id and ->display_id are both "unknown id" ...
> >>
> >>
> >>
> >> ZQ Ye
> >>
> >> 2008/6/9 Hilmar Lapp <hlapp at gmx.net>:
> >>>
> >>> If this is the case with the latest version of BioPerl it should be
> >>> filed as a bug report for the embl parser. The ID ought to be reported
> >>> in $seq->get_secondary_accessions() (which returns an array). If it
> >>> doesn't, it sounds like a bug to me.
> >>>
> >>>       -hilmar
> >>>
> >>> On Jun 9, 2008, at 4:47 AM, Marc Logghe wrote:
> >>>>
> >>>> Hi Wen,
> >>>> A dump of that sequence object (Data::Dumper is your friend!) reveals
> >>>> that the PA EMBL field is not saved into the object. However, you will
> >>>> find the string 'AB000170.1' in the embedded CDS feature, more precisely
> >>>> in the seq_id of its location object. I don't know whether that is
> >>>> always the case, but it is in your particular example.
> >>>> So, to get your hands on that value you have to do:
> >>>>
> >>>> my ($cds) = grep {$_->primary_tag eq 'CDS'} $seq->get_SeqFeatures;
> >>>> my $parent_id = $cds->location->seq_id;
> >>>>
> >>>> HTH,
> >>>> Marc
> >>>>
> >>>> Marc Logghe
> >>>> Senior Bioinformatician
> >>>> Ablynx nv
> >>>>>
> >>>>> -----Original Message-----
> >>>>> From: bioperl-l-bounces at lists.open-bio.org
> >>>>> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Wen Huang
> >>>>> Sent: Monday, June 09, 2008 5:28 AM
> >>>>> To: bioperl-l at lists.open-bio.org
> >>>>> Subject: [Bioperl-l] EMBL format field
> >>>>>
> >>>>> Hi all,
> >>>>>
> >>>>> I have an EMBL file from which I want to extract one line.
> >>>>>
> >>>>> ###file###
> >>>>> ID   BAA19060; SV 1; linear; mRNA; STD; MAM; 2115 BP.
> >>>>> XX
> >>>>> PA   AB000170.1
> >>>>> XX
> >>>>> DE   Sus scrofa (pig) endopeptidase 24.16 type M1
> >>>>> XX
> >>>>> OS   Sus scrofa (pig)
> >>>>> OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
> >>>>> OC   Eutheria; Laurasiatheria; Cetartiodactyla; Suina; Suidae; Sus.
> >>>>> OX   NCBI_TaxID=9823;
> >>>>> .........
> >>>>>
> >>>>> I want the accession number in the line that starts with PA, AB000170
> >>>>> in this example.
> >>>>>
> >>>>> Can anybody kindly help and tell me which module and method I should
> >>>>> use? I tried various things like $seq_obj->primary_id, display_id,
> >>>>> get_secondary_id, etc., but they did not work...
> >>>>>
> >>>>> Thanks a lot!
> >>>>>
> >>>>> Wen