[Bioperl-l] genbank

Dimitar Kenanov dimitark at bii.a-star.edu.sg
Tue Nov 30 01:50:42 UTC 2010


On 11/29/2010 10:39 PM, Chris Fields wrote:
> On Nov 29, 2010, at 3:35 AM, Dimitar Kenanov wrote:
>
>    
>> Hi again,
>> it seems that when I download the whole proteome from NCBI in FASTA format (with 'download_query_genbank.pl'), everything is downloaded first, some kind of SeqFastaSpeedFactory is created from it, and only then is it copied to the output file. But I want to download and write to the output file one record at a time so I can see the download progress (which works for GenBank data).
>>
>> It's frustrating :)
>>
>> Any ideas where to look for a solution?
>> Cheers
>> Dimitar
>>      
> You can't do this with the default script, but you can use a modified version and, where you are retrieving a sequence stream, in the last four lines:
>
> my $stream = $dbh->get_Stream_by_query($query);
> while( my $seq = $stream->next_seq ) {
> 	$out->write_seq($seq);
> }
>
> insert an iterator in the loop that indicates progress.  Realize the sequence data is processed through Bio::SeqIO, so it won't be exactly the same as what is retrieved from GenBank, but it should be very close.
>
> If you want raw sequence, you can use Bio::DB::EUtilities, but it's a bit more complicated.
>
> chris
>    
Hi,
thank you for the info.
I have already inserted a progress bar (Term::ProgressBar) in the last
four lines. The problem is that I only see the progress at the end: it
jumps straight to 100% done. See the attached script.
From what I read in the modules underlying the script, the way the
stream is constructed means it should be possible to read from it while
it is still being downloaded. But when I fetch FASTA sequences with NCBI
rettype=fasta, this is not possible.
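
For reference, here is a minimal sketch of the kind of per-record progress
loop discussed above. It is not the attached script: copy_with_progress is
a hypothetical helper, not part of BioPerl, and the stand-in data exists
only so the sketch runs without BioPerl or network access. It prints a
plain counter to STDERR instead of using Term::ProgressBar.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): pull records from $next one at
# a time, hand each to $write, and report a running count on STDERR.
# STDERR is unbuffered by default, so each update is shown immediately.
sub copy_with_progress {
    my ($next, $write, $every) = @_;
    $every ||= 1;                      # report after every record by default
    my $count = 0;
    while ( defined( my $rec = $next->() ) ) {
        $write->($rec);
        $count++;
        print STDERR "\r$count records written" if $count % $every == 0;
    }
    print STDERR "\r$count records written\n";
    return $count;
}

# Stand-in data (assumption, for illustration only).
my @fake = map { "seq$_" } 1 .. 5;
my @written;
my $total = copy_with_progress(
    sub { shift @fake },               # returns undef once the list is empty
    sub { push @written, $_[0] },
);
```

With the objects from the quoted snippet, the call would presumably look
like copy_with_progress( sub { $stream->next_seq }, sub { $out->write_seq($_[0]) }, 100 ).
The counter can only advance as fast as next_seq returns records, so if the
whole response is fetched before parsing starts, it will still jump to the
final value at the end.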

-- 
Dimitar Kenanov
Post doctoral fellow
Bioinformatics Institute
A*STAR Singapore
tel: +65 6478 8514
email: dimitark at bii.a-star.edu.sg

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: download_query_genbank.pl
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20101130/f288933b/attachment-0004.pl>

