[Bioperl-l] Recommended way to download qual files from Genbank?
Chris Fields
cjfields at uiuc.edu
Fri Jan 11 17:09:40 UTC 2008
I don't think this is possible with the current setup of
Bio::DB::GenBank (which bp_fetch.pl uses). We'll have to investigate
whether this data can be retrieved via NCBI's eutils; if so,
we can try adding it. If you want, you can submit this as an
enhancement request via Bugzilla for tracking:
http://bugzilla.open-bio.org/
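
In the meantime, a rough stopgap outside BioPerl might look like the
sketch below. It is plain Python, not anything bp_fetch.pl or
Bio::DB::GenBank does. It assumes the dopt=qual viewer URL quoted in the
message below still works and returns FASTA-style quality text (a ">"
header followed by whitespace-separated integer scores). Both helper
names, qual_url and parse_qual, are hypothetical.

```python
from urllib.parse import urlencode

# URL pattern copied from the quoted message; whether NCBI keeps
# serving it in this form is an assumption.
VIEWER = "http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi"


def qual_url(gi):
    """Build the dopt=qual viewer URL for a GI number (hypothetical helper)."""
    params = {"db": "nuccore", "list_uids": gi, "dopt": "qual"}
    return VIEWER + "?" + urlencode(params)


def parse_qual(text):
    """Parse FASTA-style quality text into {record_id: [score, ...]}.

    Assumes each record starts with a '>' header line and that every
    other non-blank line holds whitespace-separated integer scores.
    """
    records = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The record ID is the first token after '>'.
            parts = line[1:].split()
            current = parts[0] if parts else ""
            records[current] = []
        elif current is not None:
            records[current].extend(int(tok) for tok in line.split())
    return records
```

You would fetch the text at qual_url(154937460) with whatever HTTP
client you prefer and hand the body to parse_qual. If the viewer wraps
the data in HTML, that markup would have to be stripped first, and I
have not checked whether it does.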
chris
On Jan 11, 2008, at 10:22 AM, Phillip San Miguel wrote:
> No problem getting sequence from genbank via a myriad of methods.
> But as the volume of non-finished sequence in genbank increases the
> importance of also obtaining quality values for a given sequence
> increases. Some records include quality values.
>
> I typically use bp_fetch.pl to grab a sequence from genbank:
>
> bp_fetch.pl -fmt fasta net::genbank:AC207960
>
> sends the FASTA sequence to STDOUT. But bp_fetch.pl evidently wasn't
> designed to pull down quals:
>
> bp_fetch.pl -fmt qual net::genbank:AC207960
>
> gives:
>
> ------------- EXCEPTION: Bio::Root::Exception -------------
> MSG: You must pass a Bio::Seq::Quality or a Bio::Seq::PrimaryQual
> object to write_seq() as a parameter named "source"
> STACK: Error::throw
> STACK: Bio::Root::Root::throw /usr/local/perl_5.8/lib/site_perl/5.8.8/Bio/Root/Root.pm:359
> STACK: Bio::SeqIO::qual::write_seq /usr/local/perl_5.8/lib/site_perl/5.8.8/Bio/SeqIO/qual.pm:205
> STACK: /usr/local/perl/bin/bp_fetch.pl:313
> -----------------------------------------------------------
>
> (running under bioperl 1.5.2)
>
> The quality values for this accession are in genbank as these URLs
> demonstrate:
>
> http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=nuccore&id=154937460
>
> http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=nuccore&list_uids=154937460&dopt=fasta
>
> http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=nuccore&list_uids=154937460&dopt=qual
>
> What is the best way to pull down these qual values? They aren't
> present in "GenBank(Full)" format. They are present in an ASN.1
> format.
>
> Advice would be appreciated.
>
> --
> Phillip
> Purdue Genomics Core Facility
>
Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign