[Bioperl-l] Recommended way to download qual files from Genbank?

Cook, Malcolm MEC at stowers-institute.org
Fri Jan 11 19:14:10 UTC 2008


Indeed, eutils is capable of this.

The following use of my ncbi_eutil script (attached) yields what you
want:

ncbi_eutil -search db=nucleotide term=AC207960 -fetch rettype=qual > AC207960.qual

It depends on the version of NCBI_PowerScripting.pm, such as is
included in 
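
For readers without the script, the same request can be approximated
directly against NCBI's E-utilities. What follows is a minimal Python
sketch, not the attached ncbi_eutil script. It assumes that efetch
accepts the accession as its id parameter (skipping the search step) and
that it honours rettype=qual as in the command above. efetch_url and
fetch_to_file are hypothetical helper names used only for illustration.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Base address of the NCBI E-utilities service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def efetch_url(accession, db="nucleotide", rettype="qual"):
    """Build an efetch URL requesting `rettype` output for one accession.

    Assumption: efetch resolves an accession passed as `id` and returns
    quality values for rettype=qual, as the command above suggests.
    """
    query = urlencode({"db": db, "id": accession, "rettype": rettype})
    return f"{EUTILS_BASE}/efetch.fcgi?{query}"


def fetch_to_file(accession, path, **kwargs):
    """Download the efetch response for `accession` and save it to `path`."""
    with urlopen(efetch_url(accession, **kwargs)) as response:
        data = response.read()
    with open(path, "wb") as out:
        out.write(data)


# Example (performs a network request, so it is left commented out):
# fetch_to_file("AC207960", "AC207960.qual")
```

If the server does not return quality values for the record, the saved
file will hold whatever efetch sent back instead, so it is worth
checking the first few lines before using it downstream.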

Malcolm Cook
Database Applications Manager - Bioinformatics
Stowers Institute for Medical Research - Kansas City, Missouri
  

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org 
> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of 
> Chris Fields
> Sent: Friday, January 11, 2008 11:10 AM
> To: Phillip San Miguel
> Cc: bioperl-l
> Subject: Re: [Bioperl-l] Recommended way to download qual 
> files from Genbank?
> 
> I don't think this is possible with the current setup for 
> Bio::DB::GenBank (which the script uses).  We'll have to 
> investigate whether it is possible to retrieve this data via 
> NCBI's eutils; if so we can try adding it in.  If you want 
> you can submit this as an enhancement request via bugzilla 
> for tracking:
> 
> http://bugzilla.open-bio.org/
> 
> chris
> 
> On Jan 11, 2008, at 10:22 AM, Phillip San Miguel wrote:
> 
> > No problem getting sequence from genbank via a myriad of methods.  
> > But as the volume of non-finished sequence in genbank increases the 
> > importance of also obtaining quality values for a given sequence 
> > increases. Some records include quality values.
> >
> > I typically use bp_fetch.pl to grab a sequence from genbank:
> >
> > bp_fetch.pl -fmt fasta net::genbank:AC207960
> >
> > sends the fasta sequence to STDOUT. But bp_fetch.pl evidently
> > wasn't designed to pull down quals:
> >
> > bp_fetch.pl -fmt qual net::genbank:AC207960
> >
> > gives:
> >
> > ------------- EXCEPTION: Bio::Root::Exception -------------
> > MSG: You must pass a Bio::Seq::Quality or a Bio::Seq::PrimaryQual 
> > object to write_seq() as a parameter named "source"
> > STACK: Error::throw
> > STACK: Bio::Root::Root::throw /usr/local/perl_5.8/lib/site_perl/5.8.8/Bio/Root/Root.pm:359
> > STACK: Bio::SeqIO::qual::write_seq /usr/local/perl_5.8/lib/site_perl/5.8.8/Bio/SeqIO/qual.pm:205
> > STACK: /usr/local/perl/bin/bp_fetch.pl:313
> > -----------------------------------------------------------
> >
> > (running under bioperl 1.5.2)
> >
> > The quality values for this accession are in genbank as these URLs
> > demonstrate:
> >
> > http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=nuccore&id=154937460
> >
> > http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=nuccore&list_uids=154937460&dopt=fasta
> >
> > http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=nuccore&list_uids=154937460&dopt=qual
> >
> > What is the best way to pull down these qual values? They aren't 
> > present in "GenBank(Full)" format. They are present in an ASN.1 
> > format.
> >
> > Advice would be appreciated.
> >
> > --
> > Phillip
> > Purdue Genomics Core Facility
> >
> >
> >
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign