[Biopython-dev] Creating a NCBIFastaIterator

Keith Hughitt keith.hughitt at gmail.com
Fri Oct 7 17:02:30 UTC 2011


It's really just meant to be a bit of "polish." Originally I was thinking
not about having a separate parser but simply extending the existing FASTA
parser to recognize common formats (e.g. NCBI) and choose better ids,
annotations, etc.

Since that would create problems in terms of backwards compatibility,
however, adding a new parser seemed like the next best option.
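
For illustration, here is a minimal sketch of the kind of id and
cross-reference extraction being discussed. It is plain Python rather than
Biopython code, the function name parse_ncbi_title and the returned dict
layout are hypothetical, and it assumes the identifier is a simple run of
"db|accession" pairs (three-field forms such as pdb|entry|chain are not
handled and fall back to the raw identifier).

```python
def parse_ncbi_title(title):
    """Split an NCBI-style FASTA title line into id, dbxrefs and description.

    Hypothetical helper for illustration only; not part of Biopython.
    """
    title = title.lstrip(">").strip()
    # The identifier is everything up to the first space.
    ident, _, description = title.partition(" ")
    fields = ident.rstrip("|").split("|")

    # Assumption: a recognisable identifier is an even number of
    # alternating db/accession tokens, e.g. gi|45478711|ref|NC_005816.1|
    if len(fields) < 2 or len(fields) % 2 != 0:
        return {"id": ident, "dbxrefs": [], "description": description}

    pairs = list(zip(fields[0::2], fields[1::2]))
    # Prefer a non-GI accession as the record id when one is present.
    accessions = [acc for db, acc in pairs if db != "gi"]
    record_id = accessions[0] if accessions else pairs[0][1]

    return {
        "id": record_id,
        "dbxrefs": ["%s:%s" % (db, acc) for db, acc in pairs],
        "description": description.strip(),
    }


example = parse_ncbi_title(
    ">gi|45478711|ref|NC_005816.1| Yersinia pestis biovar Microtus "
    "str. 91001 plasmid pPCP1, complete sequence"
)
print(example["id"], example["dbxrefs"])
```

Titles that do not match the assumed pattern are returned unchanged as the
id, which is roughly what a plain FASTA parser would give you anyway.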

Part of the goal, personally, was also just to find a small but useful task
I could work on to begin learning the code and start contributing. It
shouldn't be forced, though, so I don't want to contribute something unless
it's actually an improvement.

Keith

On Fri, Oct 7, 2011 at 12:16 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:

> On Fri, Oct 7, 2011 at 5:06 PM, Andrew Sczesnak
> <andrew.sczesnak at med.nyu.edu> wrote:
> >>
> >> Maybe it is down to personal preference of coding style?
> >
> > I agree, there isn't much difference between specifying the callback
> > function in parse() or within the loop. To me, this points out that
> > re-implementing a FASTA parser simply for a format of description
> > line seems unnecessary.
> >
> > If a user is interested in extracting a particular piece of information
> > from a FASTA description and knows the input format of the file, how
> > difficult is it for them to split() it on their own? What exactly are the
> > advantages of a separate parser?
>
> Not enough of an advantage for me personally to have gone
> and written it myself ;)
>
> I can see some benefits in extracting information from the
> NCBI identifier and storing them in the SeqRecord's dbxref
> list and annotation dictionary (as consistently with our other
> parsers as possible) if you are going to want to use those
> fields yourself.
>
> Perhaps Keith can explain his interest with some examples?
>
> Peter
> _______________________________________________
> Biopython-dev mailing list
> Biopython-dev at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython-dev
>


