[BioSQL-l] PubMed IDs from SwissProt
Hilmar Lapp
hlapp at gnf.org
Tue Aug 30 21:53:28 EDT 2005
The annotation is taken from what's in the source record, so I'm
assuming you're referring to those references that have a PubMed as
well as a MEDLINE ID annotated in the SwissProt record.
If only one ID is provided, that ID will be stored in the database
(using a foreign key in the Reference table to Dbxref), so if the
MEDLINE ID is absent, the PubMed ID is stored in its place, provided it
was present in the source entry. Note that there is no on-the-fly lookup
against any external site to find the other ID if only one is given.
If both IDs are present, the relational model right now doesn't permit
you to store both because the relationship between Dbxref and Reference
is 1:n, not n:n. I.e., there is a foreign key in the Reference table,
not an association table between the two.
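To make the single-slot behaviour described above concrete, here is a
minimal stand-alone sketch. It does not use bioperl-db at all:
pick_stored_id is a hypothetical helper written only for illustration,
the database names and ID values are made up, and the MEDLINE-first
order is taken from the description in this message, not copied from the
ReferenceAdaptor source.

```perl
use strict;
use warnings;

# Hypothetical helper (NOT part of bioperl-db): returns the single
# (database, id) pair that fits into the one Dbxref foreign key of a
# Reference row, or an empty list if the reference has neither ID.
sub pick_stored_id {
    my (%ids) = @_;

    # MEDLINE is preferred when present ...
    return ('MEDLINE', $ids{medline})
        if defined $ids{medline} && length $ids{medline};

    # ... otherwise the PubMed ID takes the only available slot.
    return ('PUBMED', $ids{pubmed})
        if defined $ids{pubmed} && length $ids{pubmed};

    return;
}

# Made-up IDs: with both present, only the MEDLINE one survives.
my @both = pick_stored_id(medline => '11111111', pubmed => '22222222');
print "@both\n";    # MEDLINE 11111111

# With only a PubMed ID, that one is stored instead.
my @only = pick_stored_id(pubmed => '22222222');
print "@only\n";    # PUBMED 22222222
```

Because only one pair can be returned, the second ID is simply dropped
when both are present, which mirrors the 1:n limitation.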
You could alter the schema and accordingly
Bio/DB/BioSQL/ReferenceAdaptor.pm in bioperl-db in order to store both
IDs, but then you're no longer in sync with the biosql/bioperl-db
development.
If your main goal is to change preference from the MEDLINE ID to the
PubMed ID you can achieve that relatively easily by writing a
SeqProcessor and cheating a little on the reference annotation objects,
e.g. like this (not tested, so may contain typos, but you get the
idea):
package PubmedProcessor;
use vars qw(@ISA);
use strict;
use Bio::Seq::BaseSeqProcessor;

@ISA = qw(Bio::Seq::BaseSeqProcessor);

# check the POD of Bio::Seq::BaseSeqProcessor to understand what
# this method does
sub process_seq {
    my ($self,$seq) = @_;

    foreach my $ref ($seq->annotation->get_Annotations('reference')) {
        # don't bother if there's no pubmed ID anyway
        next unless $ref->pubmed();
        # cheat that PubMed is Medline to fool the preference order
        # in bioperl-db: swap the two IDs on the annotation object
        my $id = $ref->medline();
        $ref->medline($ref->pubmed());
        $ref->pubmed($id);
    }
    return ($seq);
}

1;
Then supply the module to load_seqdatabase.pl using the --pipeline
command line argument (see the POD).
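The effect of the swap can be checked without a database or a BioPerl
installation. The sketch below is only an illustration under stated
assumptions: FakeRef is a hypothetical stand-in that mimics nothing more
than the medline() and pubmed() get/set accessors used above (it is not
the BioPerl reference annotation class), swap_ids is a hypothetical
helper repeating the loop body of process_seq, and the ID values are
made up.

```perl
use strict;
use warnings;

{
    # Hypothetical stand-in for a reference annotation object; it only
    # provides the two get/set accessors the processor relies on.
    package FakeRef;

    sub new {
        my ($class, %args) = @_;
        return bless {%args}, $class;
    }

    sub medline {
        my $self = shift;
        $self->{medline} = shift if @_;
        return $self->{medline};
    }

    sub pubmed {
        my $self = shift;
        $self->{pubmed} = shift if @_;
        return $self->{pubmed};
    }
}

# Same logic as the loop body in process_seq: leave references without
# a PubMed ID alone, otherwise exchange the two IDs.
sub swap_ids {
    my ($ref) = @_;
    return unless $ref->pubmed();
    my $id = $ref->medline();
    $ref->medline($ref->pubmed());
    $ref->pubmed($id);
    return;
}

my $ref = FakeRef->new(medline => 'M-1', pubmed => 'P-1');
swap_ids($ref);
print $ref->medline(), "\n";    # P-1
print $ref->pubmed(),  "\n";    # M-1
```

After the swap the PubMed ID sits where the MEDLINE ID is expected, so
it is the one picked up by the MEDLINE-first preference.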
Hth,
-hilmar
On Aug 30, 2005, at 2:16 AM, Silke Trissl wrote:
> Hello,
>
> we are using BioSQL to store SwissProt. Currently we only get
> MEDLINE IDs from the literature references.
>
> My question now is: is there an easy way - like adding an additional
> argument when starting the filling process - to get PubMed IDs from
> SwissProt as well or instead?
>
> We are using BioPerl to fill a PostgreSQL database.
>
> Thanks for any help in advance.
>
> Silke Trissl
--
-------------------------------------------------------------
Hilmar Lapp email: lapp at gnf.org
GNF, San Diego, Ca. 92121 phone: +1-858-812-1757
-------------------------------------------------------------