[Bioperl-l] Recommended way to download qual files from Genbank?
Cook, Malcolm
MEC at stowers-institute.org
Fri Jan 11 19:14:10 UTC 2008
Indeed, eutil is capable of this.
The following use of my ncbi_eutil script (attached) yields what you
want:
ncbi_eutil -search db=nucleotide term=AC207960 -fetch rettype=qual > AC207960.qual
It depends on the version of NCBI_PowerScripting.pm, such as is
included in
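For anyone reading without the attached script, here is a rough sketch of the same request made directly against NCBI's E-utilities efetch endpoint. The helper name efetch_url is made up for illustration. The sketch assumes efetch accepts a bare accession as id and still honors rettype=qual as in the command above; neither is verified here.

```shell
# Hypothetical helper (not part of ncbi_eutil): print the efetch URL
# for one accession ($1) and one rettype ($2, e.g. qual or fasta).
efetch_url() {
    base='https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi'
    printf '%s?db=nucleotide&id=%s&rettype=%s&retmode=text\n' "$base" "$1" "$2"
}

# Possible usage, assuming curl is installed:
#   curl -sS "$(efetch_url AC207960 qual)" > AC207960.qual
```

This skips the search step and goes straight to the fetch, so it only works if the accession itself is accepted as an id.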
Malcolm Cook
Database Applications Manager - Bioinformatics
Stowers Institute for Medical Research - Kansas City, Missouri
> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org
> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of
> Chris Fields
> Sent: Friday, January 11, 2008 11:10 AM
> To: Phillip San Miguel
> Cc: bioperl-l
> Subject: Re: [Bioperl-l] Recommended way to download qual
> files from Genbank?
>
> I don't think this is possible with the current setup for
> Bio::DB::GenBank (which the script uses). We'll have to
> investigate whether it is possible to retrieve this data via
> NCBI's eutils; if so we can try adding it in. If you want
> you can submit this as an enhancement request via bugzilla
> for tracking:
>
> http://bugzilla.open-bio.org/
>
> chris
>
> On Jan 11, 2008, at 10:22 AM, Phillip San Miguel wrote:
>
> > No problem getting sequence from genbank via a myriad of methods.
> > But as the volume of non-finished sequence in genbank increases the
> > importance of also obtaining quality values for a given sequence
> > increases. Some records include quality values.
> >
> > I typically use bp_fetch.pl to grab a sequence from genbank:
> >
> > bp_fetch.pl -fmt fasta net::genbank:AC207960
> >
> > sends the fasta sequence to STDOUT. But bp_fetch.pl evidently wasn't
> > designed to pull down quals:
> >
> > bp_fetch.pl -fmt qual net::genbank:AC207960
> >
> > gives:
> >
> > ------------- EXCEPTION: Bio::Root::Exception -------------
> > MSG: You must pass a Bio::Seq::Quality or a Bio::Seq::PrimaryQual
> > object to write_seq() as a parameter named "source"
> > STACK: Error::throw
> > STACK: Bio::Root::Root::throw /usr/local/perl_5.8/lib/site_perl/5.8.8/Bio/Root/Root.pm:359
> > STACK: Bio::SeqIO::qual::write_seq /usr/local/perl_5.8/lib/site_perl/5.8.8/Bio/SeqIO/qual.pm:205
> > STACK: /usr/local/perl/bin/bp_fetch.pl:313
> > -----------------------------------------------------------
> >
> > (running under bioperl 1.5.2)
> >
> > The quality values for this accession are in genbank as these URLs
> > demonstrate:
> >
> >
> > http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=nuccore&id=154937460
> >
> > http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=nuccore&list_uids=154937460&dopt=fasta
> >
> > http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=nuccore&list_uids=154937460&dopt=qual
> >
> > What is the best way to pull down these qual values? They aren't
> > present in "GenBank(Full)" format. They are present in an ASN.1
> > format.
> >
> > Advice would be appreciated.
> >
> > --
> > Phillip
> > Purdue Genomics Core Facility
> >
> >
> >
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>
>