[Bioperl-l] FeatureI GFF output is not GFF version 2 compatible?
Mark Wilkinson
mwilkinson@gene.pbi.nrc.ca
Thu, 09 Nov 2000 14:58:22 -0600
Hi all!
Can someone confirm whether my understanding is correct? According to
the GFF specifications page at Sanger, under the GFF version 2 format:
"From version 2 onwards, the attribute field must have an tag value
structure following the syntax used within objects in a .ace file,
flattened onto one line by semicolon separators. Tags must be standard
identifiers ([A-Za-z][A-Za-z0-9_]*). Free text values must be quoted
with double quotes."
I just dumped a bunch of SeqFeatures using $Feature->gff_string and got
output as follows:
PBICTGAt_2_000022 NCBI NCBI_Gene 23468 24995 . - . length=509 contig_stop=4995 chr_id=3 contig_start=3468 comment=Gene=At2g01500 Synonym=F2I9.12 Product=putative homeodomain transcription factor
It appears that neither of these two requirements is being followed
by the ->gff_string method: the attributes are space-separated rather
than semicolon-separated, and the free text is not quoted. Is this my
misunderstanding of the GFF format, or is it a bug in the module? (Or
is the module not meant to be GFF version 2 compatible, even though
the documentation says that it is?)
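
To make the question concrete, here is a rough sketch of what I understand
the quoted rule to require for the attribute column. It is a standalone
Python illustration, not BioPerl code and not what ->gff_string does
internally. The function names are hypothetical, the backslash-escaping of
embedded double quotes is my assumption, and so is treating everything
after "comment=" as a single free-text value.

```python
import re

# Tag syntax as quoted from the GFF version 2 specification.
TAG_RE = re.compile(r"[A-Za-z][A-Za-z0-9_]*")
# Values that look like plain numbers are left unquoted.
NUMBER_RE = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?")


def format_value(value):
    """Return a numeric value as-is, anything else as quoted free text."""
    text = str(value)
    if NUMBER_RE.fullmatch(text):
        return text
    # Assumption: embedded backslashes and double quotes are backslash-escaped.
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def gff2_attributes(pairs):
    """Join (tag, value) pairs into one semicolon-separated attribute field."""
    fields = []
    for tag, value in pairs:
        if not TAG_RE.fullmatch(tag):
            raise ValueError("not a standard identifier: %r" % tag)
        fields.append(tag + " " + format_value(value))
    return " ; ".join(fields)


# Attribute values taken from the feature line shown above.
example = [
    ("length", 509),
    ("contig_stop", 4995),
    ("chr_id", 3),
    ("contig_start", 3468),
    ("comment", "Gene=At2g01500 Synonym=F2I9.12 "
                "Product=putative homeodomain transcription factor"),
]
print(gff2_attributes(example))
```

If my reading is right, the attribute column for the feature above would
then come out as:

length 509 ; contig_stop 4995 ; chr_id 3 ; contig_start 3468 ; comment "Gene=At2g01500 Synonym=F2I9.12 Product=putative homeodomain transcription factor"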
Any advice appreciated!
If I get the OK from you all I could go in and "fix" it myself, but I
want to make sure I don't step on anyone's toes/break anyone's parser
before doing so.
Cheers!
Mark
--
---
Dr. Mark Wilkinson
Bioinformatics Group
National Research Council of Canada
Plant Biotechnology Institute
110 Gymnasium Place
Saskatoon, SK
Canada