[Bioperl-l] Recommended way to download qual files from Genbank?

Phillip San Miguel pmiguel at purdue.edu
Fri Jan 11 19:33:13 UTC 2008


Hi Malcolm,
    Looks like your email was (inadvertently?) redacted in some way. (No 
attachment and last sentence truncated.) Would it be possible to get a 
complete version so I can be sure I'm following you?
Thanks,
Phillip

Cook, Malcolm wrote:
> Indeed, eutil is capable of this.
>
> The following use of my ncbi_eutil (attached) script yields what you
> want:
>
> ncbi_eutil -search db=nucleotide term=AC207960 -fetch rettype=qual > AC207960.qual
>
> It depends on the version of NCBI_PowerScripting.pm , such as is
> included in 
>
> Malcolm Cook
> Database Applications Manager - Bioinformatics
> Stowers Institute for Medical Research - Kansas City, Missouri
>   
>
>   
>> -----Original Message-----
>> From: bioperl-l-bounces at lists.open-bio.org 
>> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of 
>> Chris Fields
>> Sent: Friday, January 11, 2008 11:10 AM
>> To: Phillip San Miguel
>> Cc: bioperl-l
>> Subject: Re: [Bioperl-l] Recommended way to download qual 
>> files from Genbank?
>>
>> I don't think this is possible with the current setup for 
>> Bio::DB::GenBank (which the script uses).  We'll have to 
>> investigate whether it is possible to retrieve this data via 
>> NCBI's eutils; if so we can try adding it in.  If you want 
>> you can submit this as an enhancement request via bugzilla 
>> for tracking:
>>
>> http://bugzilla.open-bio.org/
>>
>> chris
>>
>> On Jan 11, 2008, at 10:22 AM, Phillip San Miguel wrote:
>>
>>     
>>> No problem getting sequence from genbank via a myriad of methods.  
>>> But as the volume of non-finished sequence in genbank increases the 
>>> importance of also obtaining quality values for a given sequence 
>>> increases. Some records include quality values.
>>>
>>> I typically use bp_fetch.pl to grab a sequence from genbank:
>>>
>>> bp_fetch.pl -fmt fasta net::genbank:AC207960
>>>
>>> sends the fasta sequence to STDOUT. But bp_fetch.pl evidently wasn't 
>>> designed to pull down quals:
>>>
>>> bp_fetch.pl -fmt qual net::genbank:AC207960
>>>
>>> gives:
>>>
>>> ------------- EXCEPTION: Bio::Root::Exception -------------
>>> MSG: You must pass a Bio::Seq::Quality or a Bio::Seq::PrimaryQual 
>>> object to write_seq() as a parameter named "source"
>>> STACK: Error::throw
>>> STACK: Bio::Root::Root::throw /usr/local/perl_5.8/lib/site_perl/5.8.8/Bio/Root/Root.pm:359
>>> STACK: Bio::SeqIO::qual::write_seq /usr/local/perl_5.8/lib/site_perl/5.8.8/Bio/SeqIO/qual.pm:205
>>> STACK: /usr/local/perl/bin/bp_fetch.pl:313
>>> -----------------------------------------------------------
>>>
>>> (running under bioperl 1.5.2)
>>>
>>> The quality values for this accession are in genbank as these URLs
>>> demonstrate:
>>>
>>> http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=nuccore&id=154937460
>>>
>>> http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=nuccore&list_uids=154937460&dopt=fasta
>>>
>>> http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=nuccore&list_uids=154937460&dopt=qual
>>>
>>> What is the best way to pull down these qual values? They aren't 
>>> present in "GenBank(Full)" format. They are present in an ASN.1 
>>> format.
>>>
>>> Advice would be appreciated.
>>>
>>> --
>>> Phillip
>>> Purdue Genomics Core Facility
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>       
>> Christopher Fields
>> Postdoctoral Researcher
>> Lab of Dr. Robert Switzer
>> Dept of Biochemistry
>> University of Illinois Urbana-Champaign
>>
>>
>>
>>
>>     
>
>   
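
For reference, the ncbi_eutil invocation quoted above appears to boil down to
an NCBI E-utilities efetch request with rettype=qual. The following is only a
minimal sketch of that request, not Malcolm's script: the function names
build_qual_url and fetch_qual are made up for illustration, passing the
accession directly as the id parameter (skipping the search step) is an
assumption, and whether efetch still serves rettype=qual is also an
assumption taken from the message above.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Base address of the NCBI E-utilities efetch service.
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_qual_url(accession, db="nucleotide"):
    """Build an efetch URL requesting quality values for one accession.

    Hypothetical helper: db and rettype mirror the ncbi_eutil command
    quoted in this thread; rettype=qual may not be honoured for every record.
    """
    params = {"db": db, "id": accession, "rettype": "qual"}
    return EFETCH_URL + "?" + urlencode(params)


def fetch_qual(accession, out_path):
    """Download the qual record and write it to out_path (network access)."""
    with urlopen(build_qual_url(accession)) as response:
        data = response.read()
    with open(out_path, "wb") as handle:
        handle.write(data)


# Example call (performs a real HTTP request, so it is left commented out):
# fetch_qual("AC207960", "AC207960.qual")
```

Calling fetch_qual("AC207960", "AC207960.qual") would be the rough equivalent
of redirecting the ncbi_eutil output to AC207960.qual.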