[Bioperl-l] Entrez Gene ASN parsers

Liu, Mingyi Mingyi.Liu at gpc-biotech.com
Sun Mar 13 00:17:59 EST 2005


> > My parser does to NCBI's ASN.1 EntrezGene file what an XML parser does
> > to a yet-to-exist XML-formatted EntrezGene file (or better than it, if
> > NCBI decides to code Entrez Gene in the XML format that Eutils
> > provide).
> 
> This is apparently what they will be doing, or at least my 
> understanding of it. 

That's logical, but not good news.  I really don't like the XML format that Eutils provides.  In fact, I have heard that few people do.

> The question is how safe are your regexps from possibly unexpected 
> things like escaped quotes or an escaped curly brace that's part of a 
> string and not end of an entity etc or whatever might confuse your 
> regexps.

It's not a problem.  In my parsers these situations are already handled.  So far, nothing in the latest human, mouse, or rat data breaks the parser.  I haven't tested other genomes, but they should work fine.
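
To illustrate the kind of input the question above is worried about, here is a minimal sketch in Python. It is not the Perl parser discussed in this thread. The names TOKEN and brace_depths and the sample record are hypothetical, and the sketch assumes that a double quote inside a string is written as a doubled quote (""), as in ASN.1 value notation. The idea is to match a whole quoted string as a single token, so that a brace inside the string is never mistaken for the end of an entity.

```python
import re

# Assumption: a quote inside a string is escaped by doubling it ("").
# Alternatives, tried in order: a complete quoted string, a structural
# brace, or a run of other non-space characters.
TOKEN = re.compile(r'"(?:[^"]|"")*"|[{}]|[^\s{}"]+')

def brace_depths(text):
    """Return the nesting depth after each structural brace.

    Braces that occur inside quoted strings are part of a string token
    and therefore do not change the depth.
    """
    depth = 0
    depths = []
    for tok in TOKEN.findall(text):
        if tok == "{":
            depth += 1
            depths.append(depth)
        elif tok == "}":
            depth -= 1
            depths.append(depth)
    return depths

# Hypothetical record: the string contains doubled quotes and a stray brace.
sample = 'gene { desc "a ""quoted"" word and a } brace" , locus "ABC" }'
print(brace_depths(sample))
```

A plain character count would see two closing braces in the sample, whereas the tokenizer sees only the one that actually closes the entity.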

BTW, an unrelated question: do you know why my replies always start new threads in the Bioperl-l mailing list archive, whereas others' (like yours) form a nice thread?

Thanks

Mingyi
