[Bioperl-l] Bio::Index::GenBank - by organism?
Jason Stajich
jason at bioperl.org
Tue Nov 10 18:50:00 UTC 2009
You might also look at what mygenbank does:
http://homepage.mac.com/iankorf/mygenbank.html
On Nov 9, 2009, at 7:55 PM, Chris Fields wrote:
> On Nov 9, 2009, at 6:05 PM, Jay Hannah wrote:
>
>> Many thanks to Ewan Birney et al. for Bio::Index::*
>>
>> I can throw away my awful grep-based index-by-accession stuff. :)
>>
>> Any chance someone has also written an organism based index
>> mechanism? Something like...
>>
>> while (my $seq = $inx->get_Seq_by_organism('*Xanthomonas*')) {
>>     print $seq->display_id . "\n";
>> }
>>
>> Thanks,
>>
>> j
>
> It should work via id_parser(); from Bio::Index::GenBank:
>
> $inx->id_parser(\&get_id);
> # make the index
> $inx->make_index($file_name);
>
> # here is where the retrieval key is specified
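> sub get_id {
>     my $line = shift;
>     # return the captured key only when the pattern matched,
>     # so a stale $1 from an earlier match is never returned
>     return $line =~ /clone="(\S+)"/ ? $1 : undef;
> }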
>
> Change the code ref to deal with the line you want and parse the name
> out. Caveat: this may not be absolutely perfect (it only passes in
> a line at a time, and some species lines will wrap). Also not sure
> how this would work in cases where multiple sequences from the same
> species are present.
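
As a rough illustration of the id_parser() approach, here is a minimal sketch of a parser that pulls the organism name from a GenBank ORGANISM line. The name organism_key is hypothetical, and the sketch shares the caveats above: it sees one line at a time, so a wrapped organism name is truncated at the first line, and records from the same species all produce the same key. How Bio::Index::GenBank treats a line for which the parser returns nothing is not confirmed here.

```perl
use strict;
use warnings;

# Hypothetical key parser (not part of BioPerl): returns the organism
# name from a GenBank "  ORGANISM  ..." line, or nothing for other lines.
sub organism_key {
    my ($line) = @_;
    if (defined $line && $line =~ /^\s*ORGANISM\s+(.+?)\s*$/) {
        return $1;    # e.g. "Xanthomonas campestris"
    }
    return;           # no key on this line
}
```

It would be registered the same way as get_id above, i.e. $inx->id_parser(\&organism_key) before calling make_index().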
>
> The other option is to preparse everything and tie a hash to store a
> species->UID map, then use that along with your Bio::Index index to
> grab what you need.
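
The preparse option could be sketched roughly as follows. This is an assumption-laden sketch, not tested against real GenBank releases: build_species_map and accessions_for are hypothetical helper names, only the first accession on the ACCESSION line and the first line of the organism name are used, and the map is kept in an ordinary in-memory hash rather than a tied one.

```perl
use strict;
use warnings;

# Hypothetical helper: scan GenBank flat-file text from a filehandle and
# return a hash ref mapping organism name => array ref of accessions.
sub build_species_map {
    my ($fh) = @_;
    my (%map, $acc, $org);
    while (my $line = <$fh>) {
        if ($line =~ /^ACCESSION\s+(\S+)/) {
            $acc = $1;                  # primary accession only
        }
        elsif ($line =~ /^\s+ORGANISM\s+(.+?)\s*$/) {
            $org = $1;                  # first line of the name only
        }
        elsif ($line =~ m{^//}) {       # end-of-record marker
            push @{ $map{$org} }, $acc if defined $org && defined $acc;
            ($acc, $org) = (undef, undef);
        }
    }
    return \%map;
}

# Hypothetical helper: accessions for every organism matching a pattern,
# in sorted order of organism name.
sub accessions_for {
    my ($map, $pattern) = @_;
    return map { @{ $map->{$_} } } grep { /$pattern/ } sort keys %$map;
}
```

Each accession returned by accessions_for($map, qr/Xanthomonas/) could then be looked up in the existing accession index, for example with $inx->fetch($acc). To persist the map as a tied hash (for example with DB_File), the accession list would need to be joined into a single string, since tied DBM hashes store only flat strings.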
>
> chris
--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org