[Bioperl-l] get_Stream_by_acc in DB/GenBank.pm
Mick Watson
michaelwatson@paradigm-therapeutics.co.uk
Wed, 17 Apr 2002 16:04:27 +0100
I'm guessing that if I download the bioperl 1.0 GenBank.pm and use it to fix the bug in the 0.7.2 GenBank.pm,
then, although hacky, everything will work nicely... :-)
Thanks!
Mick
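
[Editor's sketch] The bug described in the quoted reply below (only the last <pre> block of the response was kept) can be illustrated with a small, self-contained Perl example. This is not the actual GenBank.pm code from either release: the sample HTML, the variable names, and both regular expressions are assumptions made for illustration only.

```perl
use strict;
use warnings;

# Hypothetical stand-in for an Entrez HTML response holding two records.
my $html = join '',
    "<html><body>",
    "<pre>LOCUS       BG295424\n//\n</pre>",
    "<hr>",
    "<pre>LOCUS       BI255773\n//\n</pre>",
    "</body></html>";

# Illustrative buggy pattern: the greedy prefix skips ahead to the last
# <pre> block, so only one record survives.
my ($last_only) = $html =~ m{.*<pre>(.*?)</pre>}s;

# Illustrative fix: a global, non-greedy match returns every <pre> block.
my @records = $html =~ m{<pre>(.*?)</pre>}gs;

printf "buggy match kept %d record(s), fixed match kept %d\n",
    defined $last_only ? 1 : 0, scalar @records;
```

With the two-record sample above, the first pattern captures only the BI255773 record, while the second returns both records in order.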
Jason Stajich wrote:
> Yes, there is a bug in the older (0.7.2) version: it did not parse out
> the <pre> tags properly and only took the last sequence. In fact this bug
> affects all GenBank retrieval, because the <pre> parsing logic fails
> when multiple sequences are retrieved for a single
> accession. This happens because the Entrez queries do not ask specifically
> for accession or gi (you can't do that with their interface). Use
> Bio::DB::EMBL, which uses the EBI dbfetch mechanism and is much more
> reliable, but that is only available in 1.0... Hmm, that doesn't help you.
>
> You can have a local install of bioperl-1.0 in your own directory and just
> point your PERL5LIB to it (or add a use lib '/home/me/bioperl-1.0'
> to the scripts that you want to use bioperl 1.0). Now, if you are using
> both Ensembl and Bioperl in the same script, you're stuck.
>
> -j
>
> On Wed, 17 Apr 2002, Mick Watson wrote:
>
> > Very quick, simple question....
> >
> > Are there known bugs in &get_Stream_by_acc in DB/GenBank.pm for BioPerl
> > 0.7.2?
> >
> > Only if I use this method with two accession numbers, it tells me it is
> > using this url to fetch the sequences:
> >
> >
> > http://www.ncbi.nlm.nih.gov/entrez/utils/qmap.cgi?db=n&title=no&form=6&dopt=genbank&uid=BG295424,BI255773
> >
> > This url does indeed fetch the two sequences. However, from the method
> > call I only get one Bio::Seq object returned, and turning "-verbose" on
> > in GenBank.pm reveals that only one GenBank record is being downloaded!
> >
> > I suspect this is a bug which has been fixed, but I don't want to
> > upgrade Bioperl because I need this version for my local Ensembl to
> > work....
> >
> > Thanks
> > Mick
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l@bioperl.org
> > http://bioperl.org/mailman/listinfo/bioperl-l
> >
>
> --
> Jason Stajich
> Duke University
> jason@cgt.mc.duke.edu
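
[Editor's sketch] The local-install suggestion in the quoted reply above could look roughly like this in a script. The path is the one given in the reply; it is assumed that bioperl-1.0 has been unpacked there so that Bio/DB/GenBank.pm sits directly under it. The module is loaded with a guarded require so the script reports what happened instead of dying when no BioPerl is found.

```perl
use strict;
use warnings;

# Path taken from the reply above; adjust to wherever bioperl-1.0 is unpacked.
use lib '/home/me/bioperl-1.0';

# use lib prepends to @INC, so this directory is searched before the
# system-wide (0.7.2) install.
print "first search path: $INC[0]\n";

# Load the module at run time so the script reports, rather than dies,
# when no BioPerl is installed.
if (eval { require Bio::DB::GenBank; 1 }) {
    # %INC records the file each module was actually loaded from.
    print "Bio::DB::GenBank loaded from $INC{'Bio/DB/GenBank.pm'}\n";
} else {
    print "Bio::DB::GenBank not found on \@INC\n";
}
```

Setting PERL5LIB to the same directory in the shell has a similar effect without editing the script. Either way, the first matching Bio/DB/GenBank.pm found on @INC is the one loaded for the whole process, which is why a single script cannot mix the 0.7.2 modules Ensembl needs with the 1.0 ones.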