[Bioperl-l] Entrez parser

Mingyi Liu mingyi.liu at gpc-biotech.com
Thu Apr 14 16:36:34 EDT 2005


Stefan Kirov wrote:

> due to a small glitch in Bio::ASN1::EntrezGene, first record is empty. 
> Mingyi knows about that.

Yes, I just released a new version of Bio::ASN1::EntrezGene that fixes 
this bug.  The new release also fixes a minor line-number counting bug 
that occurs when a user parses multiple files with one parser object.

But the bigger changes in this release include:

Added a fast indexer.  (It would take some effort for Mark to add a 
different type of object to the return value of their indexer, so I 
decided to take advantage of the excellent bioperl index code base and 
develop one.  This indexer indexes the human file in 21 seconds on one 
2.4 GHz Xeon CPU.)  The return value of the indexer can be either the 
Bio::Seq object produced by Stefan's entrezgene.pm or the data hash 
produced by Bio::ASN1::EntrezGene.  Since this indexer lives as 
Bio::ASN1::EntrezGene::Indexer.pm, it will not conflict with Mark's 
indexer (Bio::Index::EntrezGene).  It is useful for those who want to 
retrieve Stefan's Entrez Gene Bio::Seq objects through an indexer.

Added test scripts
Added new convenience methods (rawdata() and fh())
File handles are now accepted too (by new() and fh().  new() now also 
accepts '-file', '-fh', and 'fh' in addition to 'file')
Updated documentation
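
As an illustration only, here is a minimal sketch of how a constructor 
could accept the four argument spellings listed above ('file', '-file', 
'fh', '-fh').  The helper normalize_input_args() is hypothetical and is 
not taken from Bio::ASN1::EntrezGene; the rule that exactly one input 
source must be given is also an assumption made for the sketch.

```perl
use strict;
use warnings;

# Hypothetical helper (NOT the module's actual code): map the four
# accepted spellings onto the two canonical keys 'file' and 'fh'.
sub normalize_input_args {
    my (%args) = @_;
    my %norm;
    for my $key (keys %args) {
        # Lowercase the key and strip one leading dash:
        # '-file' -> 'file', '-fh' -> 'fh'
        (my $canon = lc $key) =~ s/^-//;
        die "unknown argument '$key'\n"
            unless $canon eq 'file' || $canon eq 'fh';
        die "argument '$canon' given twice\n" if exists $norm{$canon};
        $norm{$canon} = $args{$key};
    }
    # Assumption for this sketch: exactly one input source is required.
    die "pass either a file name or a file handle\n"
        unless keys(%norm) == 1;
    return %norm;
}

# Both call styles end up with the same canonical keys.
my %by_name   = normalize_input_args('-file' => 'Homo_sapiens');
my %by_handle = normalize_input_args(fh => \*STDIN);
print "input file: $by_name{file}\n";
```

With a normalization step like this, the rest of a parser only has to 
check two keys regardless of which spelling the caller used.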

The new version is already available at 
http://sourceforge.net/projects/egparser.  CPAN has not been refreshed yet.

Best,

Mingyi


