[Bioperl-l] Bio::Index question
Stephen Wang
bioperl_stephen@yahoo.com
Tue, 10 Dec 2002 12:57:30 -0800 (PST)
Hi all,
I am working on a database-linking project that I inherited from
someone who left no documentation :-(
The basic flow is as follows:
1) ftp to get databases from LocusLink, GenBank,
OMIM, UniGene, RefSeq, Swiss-Prot and InterPro
2) Use bpindex.pl to index the data files, e.g.,
    ./bpindex -fmt LOCUSLINK LL /DBdata/data/LL/files/LL_tmpl
In the bpindex.pl script, the LOCUSLINK.pm module is called:
    /LOCUSLINK/ && do {
        $index = Bio::Index::LOCUSLINK->new("$dir/$name", 'WRITE');
        last;
    };
3) We use LL ids to link our data to public data, using BioPerl and
Boulder/Stone. The following example uses LocusLink id 5291:
    # $dir/$db is /DBdata/data/index/LL
    $dbobj = Bio::Index::Abstract->new("$dir/$db");
    # $id is 5291; $rawrecord is the LocusLink entry for id 5291
    $rawrecord = $dbobj->get_rSeq_by_id($id);
    $resultstone = Boulder::LocusLink->parse($rawrecord);
    # Llsynonym(2),Map(1),Id_un(1),Llofficialsymbol(1),Id_np_rs(1),Id_om(1),Llgenename(1),Id_nm_rs(1),Organism(1),Acc_gb_mrna(1),Id_ll(1),Llprimaryname(1)
4) The resultstone is then used to get meta information from the
corresponding databases.
These are the problems I am facing:
1) We currently have the Swiss-Prot data file sprot39.dat, and the
existing bioperl module has no problem indexing it. (I do not know
which version of bioperl it is. Is there an easy way to find out the
bioperl version?) However, when I tried to update the Swiss-Prot
database to sprot40.dat, I could create the index but I have a problem
calling the following:
    # $dir/$db is /DBdata/data/index/SP; SP is the index name for Swiss-Prot
    $dbobj = Bio::Index::Abstract->new("$dir/$db");
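On the version question, here is a minimal sketch that relies only on generic Perl mechanisms: it loads a module by name, then reports that module's $VERSION and the file it was loaded from. The helper name module_info is made up for illustration, and nothing is assumed about which BioPerl module, if any, carries the release number in a given installation.

```perl
use strict;
use warnings;

# Report the version and on-disk location of an installed module.
# Returns (version, path); version is undef if the module sets no $VERSION.
# Returns an empty list if the module cannot be loaded at all.
sub module_info {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;   # Bio::Index::Abstract -> Bio/Index/Abstract.pm
    eval { require $file; 1 } or return;      # not installed, or failed to compile
    my $version = $module->VERSION;           # UNIVERSAL::VERSION reads $Module::VERSION
    return ($version, $INC{$file});
}

# Module names are taken from the command line.
for my $module (@ARGV) {
    my @info = module_info($module);
    if (!@info) {
        print "$module: could not be loaded\n";
        next;
    }
    my ($version, $path) = @info;
    $version = 'no $VERSION set' unless defined $version;
    print "$module: $version ($path)\n";
}
```

Saved under a hypothetical name such as modver.pl, it could be run as `perl modver.pl Bio::Index::Abstract Bio::Index::Swissprot`. The printed path also shows which installation Perl actually picks up when an old and a new bioperl are both on the machine.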
2) I then thought the problem would probably go away with a newer
BioPerl. After installing bioperl-1.0.2, I tried to use bpindex.pl to
index my databases. Surprisingly, there is no entry for indexing the
LocusLink database! I would really appreciate it if somebody could let
me know what I should do now. The index modules for OMIM, UniGene,
RefSeq, etc. are also gone in bioperl-1.0.2.
Here are the index formats available in bioperl-1.0.2:
    SWITCH : {
        /Fasta/ && do {
            $index = Bio::Index::Fasta->new("$dir/$name", 'WRITE');
            last;
        };
        /EMBL/ && do {
            $index = Bio::Index::EMBL->new("$dir/$name", 'WRITE');
            last;
        };
        /swiss/ && do {
            $index = Bio::Index::Swissprot->new("$dir/$name", 'WRITE');
            last;
        };
        die("No index format called $fmt");
    }
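The same dispatch can be sketched as an ordered lookup table, which makes it easier to see which formats are handled and where an extra entry would go. This is only an illustration, not the code shipped with bioperl-1.0.2: the names index_class_for and open_index_for_write are made up, and the LOCUSLINK entry is copied from the older script and only helps if that older module is still installed and still compatible, which is not confirmed here.

```perl
use strict;
use warnings;

# Ordered (pattern, class) pairs; the first three mirror the SWITCH block above.
my @INDEX_CLASSES = (
    [ qr/Fasta/,     'Bio::Index::Fasta'     ],
    [ qr/EMBL/,      'Bio::Index::EMBL'      ],
    [ qr/swiss/,     'Bio::Index::Swissprot' ],
    # Taken from the older script; assumes Bio::Index::LOCUSLINK is on @INC.
    [ qr/LOCUSLINK/, 'Bio::Index::LOCUSLINK' ],
);

# Return the index class name for a format string, or die like the original.
sub index_class_for {
    my ($fmt) = @_;
    for my $pair (@INDEX_CLASSES) {
        my ($pattern, $class) = @$pair;
        return $class if $fmt =~ $pattern;
    }
    die "No index format called $fmt\n";
}

# Hypothetical helper: load the chosen class at run time and open the index
# for writing, using the same constructor call as the script.
sub open_index_for_write {
    my ($fmt, $path) = @_;
    my $class = index_class_for($fmt);
    (my $file = "$class.pm") =~ s{::}{/}g;
    require $file;                        # dies if the module is not installed
    return $class->new($path, 'WRITE');
}
```

Loading the class with require only when it is requested means a missing module is reported for the one format that needs it, while the other formats keep working.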
3) This should be an easy question: what is the difference between
EMBL and swiss?
4) Finally, I do not know much about the Boulder/Stone module. Do I
need to update it for bioperl-1.0.2?
Thanks so much in advance for your help! I really need help here, and
your help is greatly appreciated!
Stephen