[Bioperl-l] Problem retrieving CDS by Accession #
Ryan Golhar
golharam at umdnj.edu
Thu Sep 7 17:16:46 UTC 2006
> -----Original Message-----
> From: Sean Davis [mailto:sdavis2 at mail.nih.gov]
> Sent: Thursday, September 07, 2006 11:49 AM
> To: golharam at umdnj.edu
> Cc: bioperl-l at lists.open-bio.org; 'bioperl-l'
> Subject: Re: [Bioperl-l] Problem retrieving CDS by Accession #
>
>
> On Thursday 07 September 2006 10:32, Ryan Golhar wrote:
> > > On Thursday 07 September 2006 01:09, Ryan Golhar wrote:
> > > > Hi,
> > > >
> > > > I'm using Bio::DB::GenBank::get_Seq_by_acc() passing in a valid
> > > > accession #, XM_547879.2, for instance.
> > > >
> > > > I get the message in return:
> > > >
> > > > -------------------- WARNING ---------------------
> > > > MSG: acc (gb|XM_547879.2) does not exist
> > > > ---------------------------------------------------
> > > >
> > > > If I go to NCBI, and enter the accession, the GenBank entry comes up.
> > > >
> > > > At first I suspected it was the version number, but removing the
> > > > version number still causes the same error.
> > > >
> > > > Am I doing something wrong?
> > >
> > > From the docs for Bio::DB::GenBank:
> > >
> > > $seq = $gb->get_Seq_by_acc('J00522'); # Accession Number
> > > $seq = $gb->get_Seq_by_version('J00522.1'); # Accession.version
> > > $seq = $gb->get_Seq_by_gi('405830'); # GI Number
> > >
> > > So, you might try using get_Seq_by_version(....). I didn't test it,
> > > but give that a shot.
> >
> > get_Seq_by_version() worked.
> >
> > That does not explain why get_Seq_by_acc does not work with the
> > primary part of the accession #.
>
> As an example of why this shouldn't work, doing a search in entrez (online
> version) will bring up the newest version of an accession if the version is
> not included. If one specifies the version, though, one gets that version,
> even if it is not the newest. So, asking get_Seq_by_acc() with a version and
> ignoring the version would potentially get you the wrong version for the
> accession.
>
> If you know that you want the most recent version, just strip the version
> information and use get_Seq_by_acc().
>
> Sean
>
Sorry, maybe I'm not being clear. Suppose I only had the accession #,
XM_547879. If I call get_Seq_by_acc('XM_547879'), it gives the warning
above. That shouldn't happen, since I'm giving a valid accession number.
I suspect something is wrong in the parsing of whatever NCBI is
returning.
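
For reference, the distinction discussed above (get_Seq_by_version() for
"accession.version", get_Seq_by_acc() for the bare accession) could be
sketched roughly as below. This is only an illustration: split_accession and
fetch_by_id are hypothetical helper names, not part of BioPerl, and the regex
is a simplified guess at the accession format. It does not address the warning
reported for the bare accession.

```perl
use strict;
use warnings;

# Hypothetical helper: split "XM_547879.2" into ("XM_547879", 2).
# The version is undef when the identifier carries none.
# The pattern is a simplification (letters/underscores, digits, optional
# ".N") and is an assumption, not NCBI's full accession grammar.
sub split_accession {
    my ($id) = @_;
    my ($primary, $version) = $id =~ /^\s*([A-Za-z_]+\d+)(?:\.(\d+))?\s*$/
        or die "Unrecognized accession: '$id'\n";
    return ($primary, $version);
}

# Hypothetical helper: pick the lookup method based on whether a version
# is present. $db is any object providing get_Seq_by_acc and
# get_Seq_by_version, such as one returned by Bio::DB::GenBank->new.
sub fetch_by_id {
    my ($db, $id) = @_;
    my ($primary, $version) = split_accession($id);
    return defined $version
        ? $db->get_Seq_by_version("$primary.$version")  # exact version
        : $db->get_Seq_by_acc($primary);                # bare accession
}

# Usage (requires BioPerl and network access):
#   use Bio::DB::GenBank;
#   my $gb  = Bio::DB::GenBank->new;
#   my $seq = fetch_by_id($gb, 'XM_547879.2');
```

Passing the database handle in, rather than creating it inside the helper,
keeps the dispatch logic checkable without contacting NCBI.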