[Bioperl-l] Bio::Index question
Hilmar Lapp
hlapp@gnf.org
Tue, 10 Dec 2002 13:18:21 -0800
Well, I guess one of the omitted pieces of documentation was that the guy wrote the modules you're missing himself, and you just blew them away. I don't remember there ever having been indexing modules for LocusLink, Unigene, RefSeq, or OMIM in the repository.
We do have parsers for all these formats by now, though, including LocusLink. To get the parsers you'd have to install 1.1.1, or wait 3 more weeks and install v1.2, which is due to be released then.
It's probably not that hard to write indexing modules once the parsers are in place. Would you like to take on that job? It would be greatly welcome ...
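To give a feel for what such an indexing module has to do, here is a minimal sketch in plain Perl. It is not BioPerl code and does not use the Bio::Index API; it only illustrates the underlying idea of mapping a record ID to a byte offset in the flat file and seeking back to it later. It assumes that each record starts with a line of the form ">>ID", which is my assumption about the LL_tmpl layout, and the names build_index and fetch_record are made up for this illustration.

```perl
use strict;
use warnings;

# Build an in-memory index: record ID => byte offset of the record's first line.
# ASSUMPTION: every record starts with a line like ">>5291"; adjust the
# regex for other flat-file formats.
sub build_index {
    my ($file) = @_;
    my %offset_of;
    open my $fh, '<', $file or die "Cannot open $file: $!";
    my $pos = tell($fh);                 # offset of the line about to be read
    while (my $line = <$fh>) {
        $offset_of{$1} = $pos if $line =~ /^>>(\S+)/;
        $pos = tell($fh);
    }
    close $fh;
    return \%offset_of;
}

# Fetch the raw text of one record: seek to its stored offset, then read
# up to (but not including) the next record start, or to end of file.
# Returns undef in scalar context if the ID is not in the index.
sub fetch_record {
    my ($file, $index, $id) = @_;
    my $offset = $index->{$id};
    return unless defined $offset;
    open my $fh, '<', $file or die "Cannot open $file: $!";
    seek($fh, $offset, 0) or die "Cannot seek in $file: $!";
    my $record = <$fh>;                  # the ">>ID" line itself
    while (my $line = <$fh>) {
        last if $line =~ /^>>/;
        $record .= $line;
    }
    close $fh;
    return $record;
}
```

With the paths from the original message, build_index('/DBdata/data/LL/files/LL_tmpl') would give the index, and fetch_record($file, $index, 5291) would return the raw entry that is then handed to the parser. A real module would also store the offsets on disk instead of rebuilding the hash on every run.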
-hilmar
> -----Original Message-----
> From: Stephen Wang [mailto:bioperl_stephen@yahoo.com]
> Sent: Tuesday, December 10, 2002 12:58 PM
> To: bioperl-l@bioperl.org
> Subject: [Bioperl-l] Bio::Index question
>
>
> Hi all,
>
> I am working on a db linking project that was left
> over from a guy who did not write any documentation :-(
>
> The basic flow is like the following:
>
> 1) ftp to get databases from LocusLink, GenBank,
> OMIM, UniGene, RefSeq, Swiss-Prot and InterPro
>
> 2) Use bpindex.pl to index the data files
>
> e.g.,
>
> ./bpindex -fmt LOCUSLINK LL /DBdata/data/LL/files/LL_tmpl
>
> In the bpindex.pl script, the LOCUSLINK.pm module is
> called:
>
> /LOCUSLINK/ && do {
>     $index = Bio::Index::LOCUSLINK->new("$dir/$name", 'WRITE');
>     last;
> };
>
> 3) We use LL ids to link our data to public data.
> BioPerl and Boulder/Stone are used
>
> The following uses LocusLink id 5291 as an example:
>
> # $dir/$db is /DBdata/data/index/LL
> $dbobj = Bio::Index::Abstract->new("$dir/$db");
>
> # $id is 5291. The rawrecord is a LocusLink entry for id 5291
> $rawrecord = $dbobj->get_rSeq_by_id($id);
>
> $resultstone = Boulder::LocusLink->parse($rawrecord);
> # Llsynonym(2),Map(1),Id_un(1),Llofficialsymbol(1),Id_np_rs(1),
> # Id_om(1),Llgenename(1),Id_nm_rs(1),Organism(1),Acc_gb_mrna(1),
> # Id_ll(1),Llprimaryname(1)
>
> 4) The resultstone is then used to get meta
> information from the corresponding databases
>
>
> The following are the problems I am facing:
>
> 1) We currently have Swiss-Prot data in sprot39.dat,
> and the existing bioperl module has no problem indexing
> it. (I do not know what version of bioperl it is. Is
> there an easy way to find out the bioperl version?)
> However, when I tried to update the Swiss-Prot database
> to sprot40.dat, I could create the index, but I have a
> problem calling the following:
>
> # $dir/$db is /DBdata/data/index/SP; SP is the index
> # name for Swiss-Prot
> $dbobj = Bio::Index::Abstract->new("$dir/$db");
>
> 2) Then I thought the problem would probably go away
> with a new BioPerl module. After installing
> bioperl-1.0.2, I tried to use bpindex.pl to
> index my databases. Surprisingly, there is no entry to
> index the LocusLink database! I would really appreciate
> it if somebody could let me know what I should do now. The
> index modules for OMIM, UNIGENE, RefSeq, etc. are also gone
> with bioperl-1.0.2.
>
> Below I list which index formats are available in
> bioperl-1.0.2:
>
> SWITCH : {
>     /Fasta/ && do {
>         $index = Bio::Index::Fasta->new("$dir/$name", 'WRITE');
>         last;
>     };
>     /EMBL/ && do {
>         $index = Bio::Index::EMBL->new("$dir/$name", 'WRITE');
>         last;
>     };
>     /swiss/ && do {
>         $index = Bio::Index::Swissprot->new("$dir/$name", 'WRITE');
>         last;
>     };
>     die("No index format called $fmt");
> }
>
> 3) This should be an easy question: what is the
> difference between EMBL and swiss?
>
> 4) Finally, I do not know much about the Boulder/Stone
> module. Do I need to update it for bioperl-1.0.2?
>
> Thanks so much in advance for your help! I really
> need help here, and your help is greatly
> appreciated!
>
> Stephen