[Bioperl-l] Entrez Gene ASN.1 solution
Stefan Kirov
skirov at utk.edu
Wed Apr 13 13:05:25 EDT 2005
Could you please post a description of the Entrez Gene object? I am also
not very happy with creating a Bio::Seq object, as I don't think this
object should be a "one size fits all" solution, so I am very curious to
see what your design is.
I find the indexing very useful for a particular group of people
(actually we discussed this before and agreed it is a good idea).
I think having two parsers for the same format is OK for bioperl, so I
don't see any reason for your parser not to be in Bioperl.
Stefan
Mark Lambrecht wrote:
>We have developed our own interface to the NCBI
>Entrez Gene ASN.1 flat files. We needed this
>internally to replace the bioperl LocusLink parser.
>Because we have used so much great bioperl code over
>the last years, we hope that people can benefit
>from our work. This system has already proven its
>value, at least for us.
>
>The module consists of the following objects:
>
> => Bio::_GeneData.pm : abstract engine for parsing "type blocks"
>    within the NCBI ASN.1 files
> => Bio::Gene.pm : Entrez Gene object (replaces the Bioperl sequence
>    object that is normally returned by an IO object) and only keeps
>    relevant data, can easily be extended to map additional needed
>    data using the GeneData engine
> => Bio::GeneIO.pm : iterator derived from RootIO (similar to the
>    SeqIO objects); implements next_gene method.
>
> subdirectory Index with
> => Bio::Index::EntrezGene.pm : object with the capability to index
>    and consult an ASN.1 file; inherits from Bio::Index::Abstract
>
> test scripts will be committed too:
> => a few small test records (with extension asn1)
> => t_gene_indexer.pl : test file to index an ASN.1 file and return
>    an example record
>
> #example:
> my $file = "gene_hs.asn1";
>
> my $inx = Bio::Index::EntrezGene->new(
>     '-filename'   => $file.".inx",
>     '-write_flag' => 'WRITE');
>
> $inx->make_index("/usr/local/datasets/ncbi/gene/$file");
> => testGene.pl : tests a Gene object for return of appropriate
>    data fields
>
> #example for extracting only track info from the asn1 file;
> #this is a dynamic way of choosing which data to parse
> #($gene is assumed to be a Bio::Gene object created earlier)
> use Data::Dumper;
>
> my $track_info = Bio::Gene::GeneTrack->new;
>
> $track_info->geneid(1);
> $gene->type('test_type');
> $gene->track_info($track_info);
> print "dump:\n".Dumper($gene)."\n";
>
>Stefan Kirov and Mingyi Liu have produced similar
>solutions (which we didn't test); we believe that ours
>is different because it is an all-in-one lightweight
>Entrez Gene ASN.1 parser that captures only
>essential data (thereby making it rather fast). We
>deliberately chose not to map the data onto a Seq
>object. At the same time, a bioperl-compliant indexer
>has been written.
>We hope that this code can somehow be useful.
>
>We will commit the code to bioperl cvs if people
>agree, as soon as we obtain a login.
>
> Kris Ulens (bioinformatics software developer)
> Mark Lambrecht (scientist bioinformatics)
>
>Galapagos Genomics
>http://www.galapagosgenomics.com
>
--
Stefan Kirov, Ph.D.
University of Tennessee/Oak Ridge National Laboratory
5700 bldg, PO BOX 2008 MS6164
Oak Ridge TN 37831-6164
USA
tel +865 576 5120
fax +865-576-5332
e-mail: skirov at utk.edu
sao at ornl.gov
"And the wars go on with brainwashed pride
For the love of God and our human rights
And all these things are swept aside"