[Bioperl-l] How to retrieve/parse RefSeq contig entries with Bio::DB::Query::GenBank?

Jason Stajich jason.stajich at duke.edu
Sun Dec 5 15:49:28 EST 2004


On Dec 5, 2004, at 3:30 PM, Rumen Kostadinov wrote:

> well I use the following:
>
> 1.open ncbi
> 2.type the accession NT_079581
> 3.click on "Click here to see the sequence of this contig record." - a
> recent feature they added that displays the whole sequence with
> features, etc.
> 4.copy/paste it into my web program
> and use
> my $stringfh = new IO::String($seqstr);
> my $stream = new Bio::SeqIO(-fh => $stringfh,
>                                -format => 'GenBank');
> while (my $seq = $stream->next_seq) {
>  bla;
> }
> to proceed.

> it works perfectly, but I wanted to use my easier method of
> retrieving by just pasting the accession number
> and letting Bio::DB::Query::GenBank do the job.
>
Yep, that would be nice to have, but we're limited by the interfaces
made available through the NCBI tools scripts.  The web query does not
return the full GenBank record; it returns the contig format, which
holds cross references to the accession numbers of the assembled pieces
that make up the contig.

So someone has to figure out how to make it work, probably by parsing 
the contig format and then doing additional subqueries.
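
As a rough illustration of the first half of that idea, here is a minimal
sketch that splits a CONTIG join(...) expression into its pieces.  It is
not existing BioPerl code: parse_contig is a hypothetical helper, the
accessions and coordinates in the sample string are made up, and the
pattern assumes the pieces look like ACC.V:start..end, optionally wrapped
in complement(...), with gap(N) entries in between.

```perl
use strict;
use warnings;

# Hypothetical CONTIG value; accessions and coordinates are invented.
my $contig =
    'join(AC000001.1:1..1500,gap(100),complement(AC000002.3:201..4200))';

# Split a CONTIG join(...) expression into a list of hashrefs, one per
# piece.  Assumes the simple piece formats described in the lead-in.
sub parse_contig {
    my ($str) = @_;
    $str =~ s/\s+//g;                 # the field may wrap over several lines
    $str =~ s/^join\((.*)\)$/$1/;     # strip the outer join( ... )
    my @parts;
    for my $piece (split /,/, $str) {
        if ($piece =~ /^gap\((?:unk)?(\d*)\)$/) {
            push @parts, { type => 'gap', length => $1 };
        }
        elsif ($piece =~ /^(complement\()?([\w.]+):(\d+)\.\.(\d+)\)?$/) {
            push @parts, {
                type   => 'seq',
                acc    => $2,
                start  => $3,
                end    => $4,
                strand => $1 ? -1 : 1,
            };
        }
        else {
            warn "unrecognized CONTIG piece: $piece\n";
        }
    }
    return @parts;
}

for my $part (parse_contig($contig)) {
    if ($part->{type} eq 'gap') {
        print "gap of length $part->{length}\n";
    }
    else {
        # Each accession here is what a follow-up subquery would fetch.
        print "$part->{acc} $part->{start}..$part->{end} ",
              "strand $part->{strand}\n";
    }
}
```

Each accession returned this way would then still need its own lookup
against GenBank, with the sub-range extracted and reverse-complemented
where the strand is -1, before the pieces could be stitched together.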

>
> Thanks for your response!
> Rumen Kostadinov
>
>
>
> On Sun, 5 Dec 2004 15:25:21 -0500, Jason Stajich 
> <jason.stajich at duke.edu> wrote:
>> I think we deliberately bail on these - need someone to figure out how
>> to parse them correctly.
>>
>> -jason
>>
>>
>> On Dec 5, 2004, at 3:14 PM, Rumen Kostadinov wrote:
>>
>>> Hi,
>>>
>>>
>>> Is there a way to retrieve and parse RefSeq contig entries with 
>>> bioperl
>>> using the Bio::DB::Query::GenBank?
>>>
>>> e.g.
>>> NT_079581
>>>
>>> I get weird parsing when doing:
>>>
>>>    my $query_string = param('query');
>>>    my $query = Bio::DB::Query::GenBank->new(-db=>'nucleotide',
>>>                                             -query=>$query_string);
>>>    my $count = $query->count;
>>>    my @ids   = $query->ids;
>>>
>>>    # get a genbank database handle
>>>    my $gb = new Bio::DB::GenBank;
>>>
>>>    my $stream = $gb->get_Stream_by_query($query);
>>>    while (my $seq = $stream->next_seq) {
>>>        print "<tr><td>";
>>>        print '<a href="map.pl?acc='.$seq->accession_number().';">';
>>>        print $seq->accession_number(), '</a> ';
>>>        print "<td>", $seq->desc(), br;
>>>    }
>>>
>>> Sincerely,
>>> Rumen Kostadinov
>>>
>>>
>> --
>> Jason Stajich
>> jason.stajich at duke.edu
>> http://www.duke.edu/~jes12/
>>
>>
>
>
--
Jason Stajich
jason.stajich at duke.edu
http://www.duke.edu/~jes12/
