[Biopython-dev] fpc and gff

Peter biopython at maubp.freeserve.co.uk
Mon Sep 28 11:52:56 UTC 2009


On Mon, Sep 28, 2009 at 12:36 PM, Jose Blanca <jblanca at btc.upv.es> wrote:
> Sorry for the previous incomplete mail. :(
>
> Hi:
> I'm interested in parsing an fpc physical map and writing a gff3 file from it.
> That's done by the fpc people in bioperl and they go from fpc to gff2. I
> would like to do it in python.
> I've written the fpc parser looking at the bioperl one. You can take a look
> at:
> http://bioinf.comav.upv.es/svn/biolib/biolib/src/biolib/fpc.py
>
> Now I have to create the gff structure and writer. I've been reading Brad's
> code regarding the GFF parser and writer. I would like to integrate my fpc
> work as much as possible with Biopython, and if you like it we could add the
> fpc parser to Biopython in the future.
> But I don't have a clear idea of the relation between GFF and SeqFeature. The
> main problem is the sub-features and the GFF feature hierarchy. My take on that
> at the moment is to write a GFFFeature class similar to the GFF feature, with
> seqid, source, type, start, end, score, etc., and go from the fpc to
> GFFFeature objects. I know that this would not integrate nicely with
> Biopython. Could you give some hints on how to do it in a proper way?
> Best regards,

Right now there isn't a "proper way" as Brad's GFF code hasn't
been integrated into Biopython yet.

I think Brad was thinking of using the SeqFeature object "as is" to hold
GFF features, with the sub-features list used for the hierarchy.
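To make the nested idea concrete, here is a rough sketch of a feature
holding a list of sub-features, flattened into (feature, parent id) pairs
of the kind a GFF3 writer would need for its Parent attribute. The
Feature class is a hypothetical stand-in written for this example, not
the Biopython SeqFeature class, and the "contig"/"BAC" types and the ids
are made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple


@dataclass
class Feature:
    """Hypothetical stand-in for a SeqFeature-like object (not Biopython's)."""
    id: str
    type: str
    start: int
    end: int
    sub_features: List["Feature"] = field(default_factory=list)


def walk(feature: Feature,
         parent_id: Optional[str] = None) -> Iterator[Tuple[Feature, Optional[str]]]:
    """Yield (feature, parent_id) pairs depth first, parents before children."""
    yield feature, parent_id
    for child in feature.sub_features:
        yield from walk(child, feature.id)


# Illustrative data only: one contig with two overlapping clones.
contig = Feature("ctg1", "contig", 1, 5000, [
    Feature("cloneA", "BAC", 1, 3000),
    Feature("cloneB", "BAC", 2000, 5000),
])

pairs = [(f.id, parent) for f, parent in walk(contig)]
```

Each pair gives a writer what it needs to emit one GFF line per feature,
with the parent id (when there is one) going into the Parent attribute.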

Michiel and I had suggested that a simpler structure more faithful to the
GFF model might be useful (even if it was just a standardised tuple
of the start, end, strand, id, etc., and an annotation dictionary). For
the SeqIO interface, these GFF features would of course have to be
turned into normal SeqFeature objects.
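As a rough illustration of the simpler structure, something like the
following would do: a named tuple with one field per GFF column plus a
dictionary for the attributes. The names GFFRecord and to_gff3_line are
made up for this sketch and are not part of Biopython or of Brad's code,
and the writer does not escape reserved characters in the attribute
values.

```python
from collections import namedtuple

# Hypothetical record type: one field per GFF3 column, attributes as a dict.
GFFRecord = namedtuple(
    "GFFRecord",
    "seqid source type start end score strand phase attributes",
)


def to_gff3_line(rec):
    """Format one record as a tab-separated GFF3 line (no escaping done)."""
    # Sort the attribute keys so the output is reproducible.
    attrs = ";".join("%s=%s" % (key, value)
                     for key, value in sorted(rec.attributes.items()))
    columns = [
        rec.seqid,
        rec.source,
        rec.type,
        str(rec.start),
        str(rec.end),
        "." if rec.score is None else str(rec.score),
        rec.strand or ".",
        "." if rec.phase is None else str(rec.phase),
        attrs or ".",
    ]
    return "\t".join(columns)


# Illustrative values only.
rec = GFFRecord("ctg1", "fpc", "BAC", 1, 3000, None, "+", None,
                {"ID": "cloneA", "Parent": "ctg1"})
line = to_gff3_line(rec)
```

The hierarchy here lives only in the ID and Parent attributes, which is
how GFF3 itself expresses it, so nothing needs to be nested in memory.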

Peter


