[Bioperl-l] Recommended way to download qual files from Genbank?

Chris Fields cjfields at uiuc.edu
Fri Jan 11 17:09:40 UTC 2008


I don't think this is possible with the current setup for
Bio::DB::GenBank (which the script uses). We'll have to investigate
whether this data can be retrieved via NCBI's eutils; if so, we can
try adding it. If you want, you can submit this as an enhancement
request via Bugzilla for tracking:

http://bugzilla.open-bio.org/
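
Until then, one stopgap is to save the qual view by hand and parse it
locally. Below is a minimal sketch in plain Python (not BioPerl),
assuming the saved text is FASTA-style qual: a '>' header line followed
by whitespace-separated integer scores. The function name parse_qual
and the sample record are made up for illustration; whether NCBI's
page saves in exactly this layout is an assumption to check.

```python
def parse_qual(text):
    """Return {record_id: [int, ...]} parsed from FASTA-style qual text.

    Assumes each record is a '>' header line followed by one or more
    lines of whitespace-separated integer quality scores.
    """
    records = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Take the first whitespace-delimited token after '>' as the ID.
            fields = line[1:].split()
            current = fields[0] if fields else ""
            records[current] = []
        elif current is not None:
            # Scores may wrap over several lines; collect them in order.
            records[current].extend(int(tok) for tok in line.split())
    return records


if __name__ == "__main__":
    # Made-up sample record, not real data for any accession.
    sample = ">example_id hypothetical description\n51 51 40\n35 20\n"
    for seq_id, scores in parse_qual(sample).items():
        print(seq_id, len(scores), min(scores))
```

The score list should have one entry per base, so comparing its length
against the FASTA sequence length is a quick sanity check.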

chris

On Jan 11, 2008, at 10:22 AM, Phillip San Miguel wrote:

> No problem getting sequence from GenBank via a myriad of methods.
> But as the volume of non-finished sequence in GenBank increases, so
> does the importance of also obtaining quality values for a given
> sequence. Some records include quality values.
>
> I typically use bp_fetch.pl to grab a sequence from genbank:
>
> bp_fetch.pl -fmt fasta net::genbank:AC207960
>
> sends the fasta sequence to STDOUT. But bp_fetch.pl evidently
> wasn't designed to pull down quals:
>
> bp_fetch.pl -fmt qual net::genbank:AC207960
>
> gives:
>
> ------------- EXCEPTION: Bio::Root::Exception -------------
> MSG: You must pass a Bio::Seq::Quality or a Bio::Seq::PrimaryQual object to write_seq() as a parameter named "source"
> STACK: Error::throw
> STACK: Bio::Root::Root::throw /usr/local/perl_5.8/lib/site_perl/5.8.8/Bio/Root/Root.pm:359
> STACK: Bio::SeqIO::qual::write_seq /usr/local/perl_5.8/lib/site_perl/5.8.8/Bio/SeqIO/qual.pm:205
> STACK: /usr/local/perl/bin/bp_fetch.pl:313
> -----------------------------------------------------------
>
> (running under bioperl 1.5.2)
>
> The quality values for this accession are in GenBank, as these URLs
> demonstrate:
>
> http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=nuccore&id=154937460
>
> http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=nuccore&list_uids=154937460&dopt=fasta
>
> http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=nuccore&list_uids=154937460&dopt=qual
>
> What is the best way to pull down these qual values? They aren't  
> present in "GenBank(Full)" format. They are present in an ASN.1  
> format.
>
> Advice would be appreciated.
>
> -- 
> Phillip
> Purdue Genomics Core Facility

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign





