[Biopython-dev] generic format reader interface

Andrew Dalke dalke at acm.org
Mon Apr 9 14:00:43 EDT 2001


Jeff:
>In keeping with the current philosophy, I'd like to keep the formats
>separate.

I agree.  It works with bioperl since they convert the different
formats into a generic object, so putting all the I/O in one
spot is fine.

>However, I worry about the
>N**2-number-of-converters problem.

It's only N**2 if you want each conversion to preserve as much data
as possible or to run as fast as possible.  Otherwise, an intermediate
format that holds the needed data reduces it to 2*N converters.
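
To make the counting concrete, here is a small sketch of the two
cases.  The function names are made up for illustration, and the
direct count assumes one converter per ordered pair of distinct
formats (N*(N-1), which grows as N**2).

```python
def direct_converters(n):
    """Converters needed if every format converts straight to every other.

    Assumes one converter per ordered pair of distinct formats.
    """
    return n * (n - 1)


def hub_converters(n):
    """Converters needed with one shared intermediate format.

    Each format needs one reader into the intermediate form and one
    writer out of it.
    """
    return 2 * n


for n in (3, 5, 10):
    print(n, direct_converters(n), hub_converters(n))
```

The gap widens quickly as formats are added, which is the argument
for the intermediate format when full fidelity and raw speed are not
required.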

That's what bioperl does, but I'm not convinced it stores all of
the data.  Plus, if you really want performance (e.g., with conversion
to FASTA) then you lose by going through an extra layer.

> Do you have some ideas of working around that?

Some.  I'm thinking of a registry of builders and converters.
If the requested transformation exists (e.g., swissprot input to
SProt objects) then that is returned; otherwise it can look for a
1- or 2-step chain of conversions that goes from X to Y.
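
One possible shape for such a registry is sketched below.  This is
only an illustration, not an existing Biopython API: the class name
ConverterRegistry, its methods, and the toy "swissprot", "record" and
"fasta" converters are all hypothetical.  It uses a breadth-first
search, so a direct converter is always preferred over a longer
chain.

```python
from collections import deque


class ConverterRegistry:
    """Hypothetical registry mapping (source, target) to a converter."""

    def __init__(self):
        self._converters = {}

    def register(self, source, target, func):
        """Record a callable that converts data from source to target."""
        self._converters[(source, target)] = func

    def find_path(self, source, target, max_steps=2):
        """Return the shortest chain of format names, or None.

        Breadth-first search, limited to max_steps conversions.
        """
        queue = deque([[source]])
        seen = {source}
        while queue:
            path = queue.popleft()
            if path[-1] == target:
                return path
            if len(path) - 1 >= max_steps:
                continue  # chain is already as long as allowed
            for src, dst in self._converters:
                if src == path[-1] and dst not in seen:
                    seen.add(dst)
                    queue.append(path + [dst])
        return None

    def convert(self, data, source, target):
        """Apply each converter along the chain in turn."""
        path = self.find_path(source, target)
        if path is None:
            raise LookupError("no conversion from %s to %s" % (source, target))
        for src, dst in zip(path, path[1:]):
            data = self._converters[(src, dst)](data)
        return data


# Toy converters, for illustration only; these are not real parsers.
registry = ConverterRegistry()
registry.register("swissprot", "record",
                  lambda text: dict(zip(("id", "seq"), text.split())))
registry.register("record", "fasta",
                  lambda rec: ">%s\n%s\n" % (rec["id"], rec["seq"]))

print(registry.find_path("swissprot", "fasta"))
print(registry.convert("P12345 MKV", "swissprot", "fasta"))
```

Registering a direct "swissprot" to "fasta" converter later would
make find_path return the one-step chain instead, which covers the
case where speed matters more than generality.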

The idea feels right, but I haven't figured out the details.
I present it here in the hope that others have ideas.

                    Andrew
                    dalke at acm.org




