[Bioperl-l] Entrez Gene ASN.1 solution

Wed Apr 13 13:05:25 EDT 2005

Could you please post a description of the Entrez Gene object? I am also
not very happy with creating a Bio::Seq object, as I don't think this
object should be a "one size fits all" solution, so I am very curious to
see what your design is.
I find the indexing very useful for a particular group of people
(actually, we discussed this before and agreed it is a good idea).
I think having two parsers for the same format is OK for bioperl, so I
don't see any reason for your parser not to be in Bioperl.
Stefan

Mark Lambrecht wrote:

>We have developed our own interface to the NCBI
>Entrez Gene ASN.1 flat files. We needed this
>internally to replace the bioperl LocusLink parser.
>Because we have used so much great bioperl code over
>the last years, we hope that people can benefit
>from our work. This system has already proven its
>value, at least for us.
>
>The module consists of the following objects:
>
>     => Bio::_GeneData.pm : abstract engine for parsing "type blocks"
>        within the NCBI ASN.1 files
>     => Bio::Gene.pm : Entrez Gene object (replaces the Bioperl sequence
>        object that is normally returned by an IO object) and only keeps
>        relevant data; can easily be extended to map additional needed
>        data using the GeneData engine
>     => Bio::GeneIO.pm : iterator derived from RootIO (similar to the
>        SeqIO objects); implements next_gene method.
>
>     subdirectory Index with
>        => Bio::Index::EntrezGene.pm : object with capability to index
>           and consult an ASN.1 file, inherits from Bio::Index::Abstract
>
>     test scripts will be committed too:
>     => a few small test records (with extension asn1)
>     => t_gene_indexer.pl : test file to index an ASN.1 file and return
>        an example record
>
>        #example:
>        my $file = "gene_hs.asn1";
>
>        my $inx = Bio::Index::EntrezGene->new(
>            '-filename'   => $file.".inx",
>            '-write_flag' => 'WRITE');
>        $inx->make_index("/usr/local/datasets/ncbi/gene/$file");
>     => testGene.pl : tests a Gene object for return of appropriate
>        data fields
>
>        #example for only extracting track info from the asn1 file;
>        #this is a dynamic way of choosing which data to parse
>        my $track_info = new Bio::Gene::GeneTrack;
>
>        $track_info->geneid(1);
>        $gene->type('test_type');
>        $gene->track_info($track_info);
>        print "dump:\n".Dumper($gene)."\n";
>
>Stefan Kirov and Mingyi Liu have produced similar
>solutions (which we didn't test); we believe that ours
>is different because it is an all-in-one lightweight
>Entrez Gene ASN.1 parser that will only capture
>essential data (thereby making it rather fast). We
>deliberately chose not to map the data onto a Seq
>object. At the same time, a bioperl-compliant indexer
>has been written.
>We hope that this code can somehow be useful.
>
>We will commit the code to bioperl cvs if people
>agree, as soon as we obtain a login.
>
> Kris Ulens (bioinformatics software developer)
> Mark Lambrecht (scientist bioinformatics)
>
>Galapagos Genomics
>http://www.galapagosgenomics.com

-- 
Stefan Kirov, Ph.D.
University of Tennessee/Oak Ridge National Laboratory
5700 bldg, PO BOX 2008 MS6164
Oak Ridge TN 37831-6164
USA
tel +865 576 5120
fax +865-576-5332
e-mail: skirov at utk.edu
sao at ornl.gov

"And the wars go on with brainwashed pride
For the love of God and our human rights
And all these things are swept aside"