[Bioperl-l] genbank

Jason Stajich jason at bioperl.org
Tue Nov 30 02:19:56 UTC 2010


Dimitar -

In terms of your question: previously (i.e. 4-5 years ago, when this
was written) a GenBank db query WOULD NOT return a sequence if a GenPept
ID was specified, so we had to have separate modules for GenBank and
GenPept db querying, since each took a different set of parameters. I
think that has changed, so most queries can now run through GenBank.

I see that must have been improved at NCBI.  For the record, if you want
the full GenPept record with features and annotations, you just request a
different format: in this case 'gb' for GenBank instead of fasta.
As in: http://gist.github.com/721012
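
The gist itself is not reproduced here, so what follows is only a rough
sketch of the idea and not necessarily what the gist does. It assumes
Bio::DB::GenBank accepts a -format option (as in its documented synopsis)
and that 'gb' is the NCBI-side name while Bio::SeqIO wants 'genbank' for
writing. The sub name fetch_record, the %WRITER_FORMAT table, and the
accession placeholder are made up for illustration.

```perl
use strict;
use warnings;

# NCBI-side format name => Bio::SeqIO writer format name.
# 'gb' returns the full annotated record, 'fasta' the sequence only.
our %WRITER_FORMAT = ( gb => 'genbank', fasta => 'fasta' );

# Fetch one record by accession and write it to $fh in the chosen format.
# Needs BioPerl and network access to NCBI, so the modules are loaded
# only when the sub is actually called.
sub fetch_record {
    my ( $acc, $format, $fh ) = @_;
    my $writer_format = $WRITER_FORMAT{$format}
        or die "unsupported format '$format'\n";

    require Bio::DB::GenBank;
    require Bio::SeqIO;

    my $db  = Bio::DB::GenBank->new( -format => $format );
    my $seq = $db->get_Seq_by_acc($acc);    # remote query to NCBI
    my $out = Bio::SeqIO->new( -fh => $fh, -format => $writer_format );
    $out->write_seq($seq);
    return $seq;
}

# Hypothetical usage; replace the placeholder with a real accession:
# fetch_record( 'YOUR_ACCESSION', 'gb', \*STDOUT );
```

Switching the second argument between 'gb' and 'fasta' is the only
change needed to go from the annotated record to the bare sequence.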

But maybe you already figured it out?

-jason

Chris Fields wrote, on 11/29/10 6:39 AM:
> On Nov 29, 2010, at 3:35 AM, Dimitar Kenanov wrote:
>
>> Hi again,
>> it seems that when I download the whole proteome from NCBI in FASTA format (with 'download_query_genbank.pl'), everything is downloaded first, some kind of SeqFastaSpeedFactory is created from it, and only then is it copied to the output file. But I want to download and write to the output file one record at a time so I can see the download progress (which already works for GenBank data).
>>
>> It's frustrating :)
>>
>> Any ideas where to look for a solution?
>> Cheers
>> Dimitar
>
> You can't do this with the default script, but you can use a modified version. Where it retrieves the sequence stream, in the last four lines:
>
> my $stream = $dbh->get_Stream_by_query($query);
> while( my $seq = $stream->next_seq ) {
> 	$out->write_seq($seq);
> }
>
> insert a counter in the loop that reports progress.  Be aware that the sequence data is processed through Bio::SeqIO, so the output won't be exactly the same as what is retrieved from GenBank, but it should be very close.
>
> If you want raw sequence, you can use Bio::DB::EUtilities, but it's a bit more complicated.
>
> chris
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
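
The progress counter Chris describes in the quoted message can be
sketched roughly as below. This is only an illustration, not the
modified script itself: copy_with_progress and the two Demo:: packages
are hypothetical names, and the Demo:: classes are stand-ins so the loop
can be tried without BioPerl or a network connection. Any object with a
next_seq method works as the stream, and any object with write_seq works
as the writer.

```perl
use strict;
use warnings;

# Copy every sequence from $stream to $out, reporting progress on STDERR
# after every $every records. Returns the number of records written.
sub copy_with_progress {
    my ( $stream, $out, $every ) = @_;
    $every ||= 1;
    my $count = 0;
    while ( my $seq = $stream->next_seq ) {
        $out->write_seq($seq);
        $count++;
        print STDERR "\rwritten $count sequences" if $count % $every == 0;
    }
    print STDERR "\rwritten $count sequences (done)\n";
    return $count;
}

{
    package Demo::Stream;    # hypothetical stand-in for a sequence stream
    sub new      { my ( $class, @items ) = @_; return bless { items => [@items] }, $class }
    sub next_seq { my $self = shift; return shift @{ $self->{items} } }

    package Demo::Writer;    # hypothetical stand-in for a Bio::SeqIO writer
    sub new       { my $class = shift; return bless { written => [] }, $class }
    sub write_seq { my ( $self, $seq ) = @_; push @{ $self->{written} }, $seq; return 1 }
}

# Dry run with the stand-ins:
# copy_with_progress( Demo::Stream->new(qw(a b c)), Demo::Writer->new, 1 );
```

In the real script the first argument would be the $stream returned by
$dbh->get_Stream_by_query($query) and the second the existing $out
object, so the only change is replacing the four-line loop with this
call. Progress goes to STDERR so it does not mix into sequence output
written to STDOUT.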

-- 
Jason Stajich
jason at bioperl.org



