[Bioperl-l] Bio::Index question

Stephen Wang bioperl_stephen@yahoo.com
Tue, 10 Dec 2002 12:57:30 -0800 (PST)


Hi all,

	I am working on a database-linking project that I inherited from
	someone who left no documentation :-(
	
	The basic flow is as follows:
	
	1) Download the databases by FTP from LocusLink, GenBank, OMIM,
	UniGene, RefSeq, Swiss-Prot and InterPro
	
	2) Use bpindex.pl to index the data files
	
		e.g.,

		./bpindex -fmt LOCUSLINK LL /DBdata/data/LL/files/LL_tmpl

		In the bpindex.pl script, the LOCUSLINK.pm module is called:

		    /LOCUSLINK/ && do {
		        $index = Bio::Index::LOCUSLINK->new("$dir/$name", 'WRITE');
		        last;
		    };
		    
	3) We use LL ids to link our data to the public data, using BioPerl
	and Boulder/Stone.

		The following example uses LocusLink id 5291:

		# $dir/$db is /DBdata/data/index/LL
		$dbobj = Bio::Index::Abstract->new("$dir/$db");

		# $id is 5291; $rawrecord is the LocusLink entry for id 5291
		$rawrecord = $dbobj->get_rSeq_by_id($id);

		$resultstone = Boulder::LocusLink->parse($rawrecord);
		# $resultstone contains:
		# Llsynonym(2),Map(1),Id_un(1),Llofficialsymbol(1),Id_np_rs(1),Id_om(1),
		# Llgenename(1),Id_nm_rs(1),Organism(1),Acc_gb_mrna(1),Id_ll(1),Llprimaryname(1)

	4) The $resultstone is then used to fetch meta-information from the
	corresponding databases
	
	
	These are the problems I am facing:
	
	1) We currently have Swiss-Prot data in sprot39.dat, and the existing
	bioperl module has no problem indexing it. (I do not know which version
	of bioperl it is. Is there an easy way to find out the bioperl version?)
	However, when I tried to update the Swiss-Prot database to sprot40.dat,
	I could create the index but I have a problem calling the following:

		# $dir/$db is /DBdata/data/index/SP; SP is the index name for Swiss-Prot
		$dbobj = Bio::Index::Abstract->new("$dir/$db");
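
On the version question, one generic Perl approach (nothing BioPerl-specific) is to load a module, ask it for its $VERSION, and look up where it was installed from. The following is a minimal sketch under assumptions: module_info is a hypothetical helper name, Bio::Seq is only an example of a module to probe, and whether that module sets $VERSION at all depends on the release, so the version may come back undefined. The installed path often gives a hint even then.

```perl
use strict;
use warnings;

# Hypothetical helper: return (version, path) for a module, or an empty
# list if the module cannot be loaded from @INC.
sub module_info {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;   # Bio::Seq -> Bio/Seq.pm
    return unless eval { require $file; 1 };
    my $version = $module->VERSION;           # undef if the module sets no $VERSION
    return ($version, $INC{$file});
}

# Assumption: Bio::Seq is used here only as a representative BioPerl module.
my ($version, $path) = module_info('Bio::Seq');
if (defined $path) {
    printf "Bio::Seq version %s, loaded from %s\n",
        (defined $version ? $version : 'unknown'), $path;
}
else {
    print "Bio::Seq was not found on \@INC\n";
}
```

If two BioPerl installations are present, the printed path shows which one the scripts actually pick up.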
		
	2) Then I thought the problem would probably go away with a newer
	BioPerl. After installing bioperl-1.0.2, I tried to use its bpindex.pl
	to index my databases. Surprisingly, there is no entry for indexing the
	LocusLink database! I would really appreciate it if somebody could let
	me know what I should do now. The index modules for OMIM, UNIGENE,
	RefSeq, etc. are also gone in bioperl-1.0.2.
	
		These are the index formats available in bioperl-1.0.2:
		
		SWITCH : {
		    /Fasta/ && do {
		        $index = Bio::Index::Fasta->new("$dir/$name", 'WRITE');
		        last;
		    };
		    /EMBL/ && do {
		        $index = Bio::Index::EMBL->new("$dir/$name", 'WRITE');
		        last;
		    };
		    /swiss/ && do {
		        $index = Bio::Index::Swissprot->new("$dir/$name", 'WRITE');
		        last;
		    };
		    die("No index format called $fmt");
		}
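
For illustration, the format dispatch above can also be written as a lookup table, which makes it easier to see which formats are handled and where a locally kept module would slot in. This is only a sketch: index_class_for is a hypothetical helper, and the LOCUSLINK entry assumes the old Bio::Index::LOCUSLINK module from the inherited installation is still available, since bioperl-1.0.2 does not ship one. The sketch only resolves class names and does not load or call any index module.

```perl
use strict;
use warnings;

# Format pattern => index class, checked in order (first match wins),
# mirroring the SWITCH block above.
my @index_classes = (
    [ qr/Fasta/     => 'Bio::Index::Fasta'     ],
    [ qr/EMBL/      => 'Bio::Index::EMBL'      ],
    [ qr/swiss/     => 'Bio::Index::Swissprot' ],
    # Assumption: a local module carried over from the old installation.
    [ qr/LOCUSLINK/ => 'Bio::Index::LOCUSLINK' ],
);

# Hypothetical helper: return the class name for a format string,
# or die with the same message the SWITCH block uses.
sub index_class_for {
    my ($fmt) = @_;
    for my $entry (@index_classes) {
        my ($pattern, $class) = @$entry;
        return $class if $fmt =~ $pattern;
    }
    die "No index format called $fmt\n";
}

print index_class_for('LOCUSLINK'), "\n";
```

The returned class name would then be loaded and used in the same way as the `->new("$dir/$name", 'WRITE')` calls in the SWITCH block.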

	3) This should be an easy question: what is the difference between
	the EMBL and swiss formats?
	
	4) Finally, I do not know much about the Boulder/Stone modules. Do I
	need to update them for bioperl-1.0.2?
	
	Thanks so much in advance! I really need help here, and any help is
	greatly appreciated!
	
	Stephen

