[Bioperl-l] Forgot to post to the list; DB::Universal
Ewan Birney
birney@ebi.ac.uk
Sat, 26 May 2001 18:12:36 +0100 (BST)
I forgot to post this to the list, but last week, on one of my daily
London<->Hinxton train rides, I implemented Lincoln's universal DB idea.
Here is its synopsis:
=head1 NAME
Bio::DB::Universal - Artificial database that delegates to specific
databases
=head1 SYNOPSIS
$uni = Bio::DB::Universal->new();
# by default connects to web databases. We can also
# substitute local databases
$embl = Bio::Index::EMBL->new( -filename => '/some/index/filename/locally/stored');
$uni->use_database('embl',$embl);
# treat it like a normal database. Recognises strings
# like gb|XXXXXX and embl:YYYYYY
$seq1 = $uni->get_Seq_by_id("embl:HSHNRNPA");
$seq2 = $uni->get_Seq_by_acc("gb|A000012");
# with no separator, tries to guess database. In this case the
# _ is considered to be indicative of swissprot
$seq3 = $uni->get_Seq_by_id('ROA1_HUMAN');
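For anyone who wants to see the delegation idea in isolation, here is a minimal sketch of how a universal database can hand requests on to registered handles. It is not the actual Bio::DB::Universal code: My::MockDB and My::Universal are made-up names for illustration, the mock returns a string rather than a sequence object, and only the explicit "tag:id" form is handled.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a real database object: anything that
# provides get_Seq_by_id will do.
package My::MockDB;
sub new { my ( $class, $name ) = @_; return bless { name => $name }, $class }
sub get_Seq_by_id { my ( $self, $id ) = @_; return "$self->{name} record for $id" }

# Hypothetical, stripped-down universal database: a hash from tag to handle.
package My::Universal;
sub new { return bless { db => {} }, shift }

sub use_database {
    my ( $self, $tag, $db ) = @_;
    $self->{db}{$tag} = $db;
}

sub get_Seq_by_id {
    my ( $self, $str ) = @_;
    # Only the explicit "tag:id" form is handled in this sketch.
    my ( $tag, $id ) = split /:/, $str, 2;
    my $db = $self->{db}{$tag} or die "No database registered for '$tag'\n";
    return $db->get_Seq_by_id($id);
}

package main;

my $uni = My::Universal->new;
$uni->use_database( 'embl', My::MockDB->new('local-embl') );
print $uni->get_Seq_by_id('embl:HSHNRNPA'), "\n";    # local-embl record for HSHNRNPA
```

The point of the hash is that a local index and a web database look the same to the caller, as long as both answer get_Seq_by_id.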
The magic happens here:
(If anyone can think of better regexes, shout. If there is another
database we could add *sensibly*, that would also be great.)
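=head2 guess_id
Title : guess_id
Usage : ($tag,$id) = $self->guess_id($str);
Function: Guesses which database an id string refers to, using an explicit prefix (gb|XXXXXX, embl:YYYYYY) if present, otherwise the shape of the id. Throws if a prefix is present but not recognised
Example : ($tag,$id) = $self->guess_id('embl:HSHNRNPA'); # ('embl','HSHNRNPA')
Returns : A list of (database tag, id); the tag is 'genbank', 'embl' or 'swiss'
Args : An id string, with or without a database prefix
=cut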
sub guess_id {
    my ($self,$str) = @_;
    if( $str =~ /(\S+)[:|\/;](\w+)/ ) {
        my $tag;
        my $db = $1;
        my $id = $2;
        if( $db =~ /gb/i || $db =~ /genbank/i || $db =~ /ncbi/i ) {
            $tag = 'genbank';
        } elsif ( $db =~ /embl/i || $db =~ /emblbank/ || $db =~ /^em/i ) {
            $tag = 'embl';
        } elsif ( $db =~ /swiss/i || $db =~ /^sw/i || $db =~ /sptr/ ) {
            $tag = 'swiss';
        } else {
            # throw for the moment
            $self->throw("Could not guess database type $db from $str");
        }
        return ($tag,$id);
    } else {
        my $tag;
        # auto-guess from just the id
        if( $str =~ /_/ ) {
            $tag = 'swiss';
        } elsif ( $str =~ /^[QPR]\w+\d$/ ) {
            $tag = 'swiss';
        } elsif ( $str =~ /[A-Z]\d+/ ) {
            $tag = 'genbank';
        } else {
            # default genbank...
            $tag = 'genbank';
        }
        return ($tag,$str);
    }
}
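To make it easier to try out alternative regexes, here is a minimal standalone sketch of the same guessing rules as a plain function. The name guess_db is made up for illustration, it dies where the method calls $self->throw, and 'P09651' is just an extra test id that is not in the synopsis.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical standalone version of guess_id, for experimenting with the
# regexes outside of a Bio::Root object.
sub guess_db {
    my ($str) = @_;

    if ( $str =~ /(\S+)[:|\/;](\w+)/ ) {
        my ( $db, $id ) = ( $1, $2 );
        return ( 'genbank', $id ) if $db =~ /gb|genbank|ncbi/i;
        return ( 'embl',    $id ) if $db =~ /embl|^em/i;
        return ( 'swiss',   $id ) if $db =~ /swiss|^sw/i || $db =~ /sptr/;
        die "Could not guess database type $db from $str\n";
    }

    # No separator: guess from the shape of the id alone.
    return ( 'swiss', $str ) if $str =~ /_/;
    return ( 'swiss', $str ) if $str =~ /^[QPR]\w+\d$/;
    return ( 'genbank', $str );    # everything else defaults to genbank
}

for my $str ( 'embl:HSHNRNPA', 'gb|A000012', 'ROA1_HUMAN', 'P09651' ) {
    my ( $tag, $id ) = guess_db($str);
    print "$str -> $tag, $id\n";
}
# embl:HSHNRNPA -> embl, HSHNRNPA
# gb|A000012 -> genbank, A000012
# ROA1_HUMAN -> swiss, ROA1_HUMAN
# P09651 -> swiss, P09651
```

An unrecognised prefix such as 'foo:bar' dies rather than falling back to genbank, matching the "throw for the moment" branch.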
-----------------------------------------------------------------
Ewan Birney. Mobile: +44 (0)7970 151230, Work: +44 1223 494420
<birney@ebi.ac.uk>.
-----------------------------------------------------------------