[Bioperl-l] ASN.1 and BioPerl ?
Peter.Robinson at t-online.de
Sat Feb 12 16:37:56 EST 2005
On Sat, Feb 12, 2005 at 01:20:30PM -0800, Hilmar Lapp wrote:
> The ASN.1 parser would be very useful, in particular for implementing
> the NCBI Gene parser I suppose.
>
> I do suggest though that you publish this as a separate module on CPAN,
> as supposedly it is (or meant to be?) generically useful, so I
> completely agree with Chris on this.
I also agree that it would be better to have the module on CPAN; if you
have been inspired to use the module to incorporate Entrez Gene into BioPerl, I
would be happy to help out as I can. My initial experiences with this suggest
that it will not be easy.
>
> I need an NCBI Gene parser implemented in the Bio::SeqIO framework
> returning compatible Bio::SeqI objects within the next few weeks. The
> speed needs to be at least several records per second, ideally 10/s or
> higher.
>
> My understanding is that Peter has a grammar-based parser in Java
> (speed I don't know), and Steve has a Parse::RecDescent-based parser in
> perl (not bioperl) which is (expectedly) slow.
>
> I've seen Graham Barr's module on CPAN but haven't tried it yet; it
> seemed to me that you need the ASN model definition to start with,
> which I haven't seen at any obvious or not-so-obvious place on the NCBI
> ftp site, so I either missed something or you have to download the
> entire toolkit or something else.
You might want to take a look at this:
http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/source/src/objects/entrezgene/entrezgene.asn
Note that there appear to be some inconsistencies between some Entrez Gene
records and this specification (or I have misunderstood something).
After having played around with Perl, BioPerl, lex/yacc and, more recently,
ANTLR, I have the impression that this is a doable task using ANTLR and a
modest amount of Java code. (By doable I mean that it is possible to extract
the information one wants from a species-specific ASN.1 Gene file.) Given my
schedule I don't know when I will be able to finish this, but I will send the
list a mail, presuming there is no BioPerl tool to do this by then.
-Peter