[Biopython-dev] Martel changes

Jeffrey Chang jchang at smi.stanford.edu
Fri Dec 14 02:01:59 EST 2001


On Wed, Dec 12, 2001 at 01:05:55PM -0700, Andrew Dalke wrote:
> Me:
> >> Is anyone using the iterator facility in Martel?
> 
> Jeff:
> >Yes.  I'm using it in Bio/Medline/NLMMedlineXML to parse the
> >XML-formatted PubMed records.  Each XML file contains about 30000
> >records and is too big to keep in memory at once.

Oops, I just looked over the code.  I'm in fact not using the
iterator, but the RecordReader.  Sorry about the confusion!
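
To illustrate the record-at-a-time idea, here is a rough sketch in plain
Python.  It is not Martel's RecordReader and uses none of its API; the
function name read_records and the <Record> tags are made up for the
example, and it assumes each start and end tag sits on its own line.

```python
import io


def read_records(handle, start_tag, end_tag):
    """Yield one record at a time so the whole file never sits in memory.

    Hypothetical helper: assumes start_tag and end_tag each appear on
    their own line, which real XML does not guarantee.
    """
    lines = None  # None means we are currently outside a record
    for line in handle:
        if lines is None:
            if start_tag in line:
                lines = [line]  # a new record begins here
        else:
            lines.append(line)
        if lines is not None and end_tag in line:
            yield "".join(lines)
            lines = None  # drop the record before reading the next one


# Usage with an in-memory file; a real run would pass an open file handle.
data = "<Set>\n<Record>\na\n</Record>\n<Record>\nb\n</Record>\n</Set>\n"
for record in read_records(io.StringIO(data), "<Record>", "</Record>"):
    print(len(record))
```

Only one record is held at a time, which is the property that matters
for the big files.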


[adding Word, Integer, ... as built-in expressions]

> When do you use Unprintable?  When do you use Punctuation?

I use them both for matching things in English text.  Sometimes the
text contains unprintable characters from foreign character sets.

> My 'Float' isn't very powerful, as it only understands
> numbers of the form (with optional +/-)
>   1
>   1.
>   1.2
>   .2
> 
> It doesn't handle things like 1E-3, or IEEE values
> like NaN or +Inf.  I could (and probably should) support
> the first of these.  I'm not sure if I should support the second.

It gets pretty complicated, e.g.
1.315E2.24
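
As a rough sketch of what a wider Float could accept, here is a plain
Python regular expression.  It is not Martel's Float definition; the
names FLOAT_RE and is_float are made up for the example, and accepting
only "inf" and "nan" (in any case) for the IEEE values is an assumption.

```python
import re

# Hypothetical pattern for illustration, not Martel's Float expression.
FLOAT_RE = re.compile(
    r"""
    [+-]?                        # optional sign
    (?:
        (?:\d+\.?\d*|\.\d+)      # mantissa: 1, 1., 1.2, .2
        (?:[eE][+-]?\d+)?        # optional integer exponent: E-3
      | inf | nan                # IEEE special values, matched in any case
    )
    """,
    re.VERBOSE | re.IGNORECASE,
)


def is_float(text):
    """Return True if the whole string matches the sketch pattern."""
    return FLOAT_RE.fullmatch(text) is not None


for candidate in ["1", "1.", "1.2", ".2", "1E-3", "NaN", "+Inf", "1.315E2.24"]:
    print(candidate, is_float(candidate))
```

This version only allows an integer exponent, so it rejects 1.315E2.24.
Whether input like that should match is exactly the kind of decision
that makes it complicated.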

Jeff


