[Bioperl-l] GFF3
Rob Edwards
rob at salmonella.org
Sat Jan 15 18:22:23 EST 2005
Because I need it for some things that I am doing, I have worked quite
a bit on the GFF3 parser Bio::FeatureIO::gff. Several people have
written this module; I have just made some cosmetic changes:
I have improved the validation that is applied as a GFF3 file is
parsed, and the module should now validate essentially everything in
the file except alignments. Validation is optional and is based on the
specification described at:
http://song.sourceforge.net/gff3.shtml
For clarification and edification I have created a couple of tables
describing the module and the validation that is applied to GFF3 files,
which you can see online: http://www.salmonella.org/bioperl/gff3.html
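To give a rough idea of what line-level validation against the spec
involves, here is a minimal standalone sketch in Python. It is not the
code in Bio::FeatureIO::gff and it does not cover everything the tables
list; the function name validate_gff3_line is made up for illustration,
and the checks are only the basic column rules from the GFF3
specification.

```python
def validate_gff3_line(line):
    """Return a list of problems found in one GFF3 feature line (empty if none).

    Hypothetical helper for illustration only; checks basic column rules.
    """
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        return ["expected 9 tab-separated columns, found %d" % len(cols)]
    seqid, source, ftype, start, end, score, strand, phase, attrs = cols
    problems = []
    # Coordinates are 1-based integers with start <= end.
    if not (start.isdigit() and end.isdigit()):
        problems.append("start and end must be positive integers")
    elif int(start) < 1 or int(start) > int(end):
        problems.append("start must be >= 1 and <= end")
    # Score is a floating point number, or '.' when undefined.
    if score != ".":
        try:
            float(score)
        except ValueError:
            problems.append("score must be a number or '.'")
    if strand not in ("+", "-", ".", "?"):
        problems.append("strand must be one of + - . ?")
    if phase not in ("0", "1", "2", "."):
        problems.append("phase must be 0, 1, 2 or '.'")
    return problems
```

An empty list means the line passed these checks; anything else is a
list of human-readable complaints that a caller could warn about or
treat as fatal, which is one way to keep validation optional.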
I also wrote a Bio::SeqIO::gff module. Since GFF3 files can hold
sequences, it seems that you'd want to be able to call the next_seq
methods, and therefore SeqIO is more appropriate than FeatureIO for
those aspects. Currently the SeqIO module uses the FeatureIO module to
parse the file and just reorganizes the results.
This provides two different interfaces for getting objects out of GFF3
files:
Bio::FeatureIO::gff will return Bio::SeqFeature::Annotated objects
representing the annotations.
Bio::SeqIO::gff will return Bio::Seq objects representing the
sequences with all the annotations attached.
The other difference between the two is that the former passes out the
objects as they are read, but the latter has to read the whole file to
get the annotations and the sequences.
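The difference between the two access patterns can be sketched in a
few lines of standalone Python. This is only an illustration of the
idea, not the BioPerl code: the names iter_features and read_sequences
and the dict layout are invented here, and attributes, validation and
most directives are ignored. The one assumption taken from the GFF3
spec is that sequences follow a ##FASTA line at the end of the file.

```python
import io
from collections import defaultdict

def iter_features(handle):
    """Yield one feature dict per line as it is read (streaming)."""
    for line in handle:
        if line.startswith("##FASTA"):
            return  # sequence section begins; no more features
        if line.startswith("#") or not line.strip():
            continue  # skip directives, comments and blank lines
        cols = line.rstrip("\n").split("\t")
        yield {"seqid": cols[0], "type": cols[2],
               "start": int(cols[3]), "end": int(cols[4])}

def read_sequences(handle):
    """Read the whole file, then return {seqid: {"seq": str, "features": list}}."""
    records = defaultdict(lambda: {"seq": "", "features": []})
    for feat in iter_features(handle):
        records[feat["seqid"]]["features"].append(feat)
    # iter_features stopped at ##FASTA (or end of file); the rest is FASTA.
    current = None
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            current = line[1:].split()[0]
            records[current]  # make sure a record exists even without features
        elif current is not None:
            records[current]["seq"] += line
    return dict(records)

# Usage with an in-memory file:
data = ("##gff-version 3\n"
        "ctg1\t.\tgene\t1\t9\t.\t+\t.\tID=g1\n"
        "##FASTA\n>ctg1\nACGTACGTA\n")
for feat in iter_features(io.StringIO(data)):
    print(feat["type"], feat["start"], feat["end"])
```

The first function can hand back each feature immediately, while the
second cannot return anything until it has seen the end of the file,
because the sequence for a feature's seqid may only appear after all
of the annotations.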
So far I have focussed on reading GFF3 files.
I have not committed these to cvs yet, pending comments from others. I
have some specific questions:
Should I wait until after 1.5 is out?
Are two separate modules really the right way to go about this?
What about other GFF modules (like Bio::Tools::GFF)?
Could someone give the modules a workout and let me know about bugs? I
am sure there are many.
I have posted these modules online via anonymous ftp at
ftp://ftp.salmonella.org/rob/bioperl/GFF_modules.tgz
Take a look and let me know what you do and don't like!
Rob