[Bioperl-l] new GFF2 parsing/dumping routines committed...
Mark Wilkinson
mwilkinson@gene.pbi.nrc.ca
Mon, 20 Nov 2000 17:03:01 -0600
Hi Group!
The last two commits from me allow the import/export of (as far as I understand) properly formatted GFF2.
To create generic features from GFF2 you would make the call as follows:
$Feature = new Bio::SeqFeature::Generic (-gff2_string => $string);
The most important differences between the original -gff_string and the new -gff2_string options are as follows:
(1) the fields *must be TAB-separated* (formerly it was splitting on whitespace, but that would choke on the freetext that is now allowed)
(2) there is no default "group" tag created. You must specify group=MyGroup in the attributes field
(3) tag/value units are semicolon separated
(4) tags can have more than one space-separated value
(5) free-text is allowed as a value so long as it is double-quoted.
(6) comments are allowed but are ignored (comments are at the end of the GFF line, preceded by a # symbol)
An example of a GFF string that could be parsed by this routine would be:
mysequence GMHMM exon 100 200 45 . . group=MyFavGene;notes="the answer" "to LtUandE is" 42 # these are comments
This results in a feature with the following structure:
0  Bio::SeqFeature::Generic=HASH(0x844db70)
   '_gsf_end' => 200
   '_gsf_score' => 45
   '_gsf_seqname' => 'abc'
   '_gsf_start' => 100
   '_gsf_strand' => 0
   '_gsf_sub_array' => ARRAY(0x84507e8)
        empty array
   '_gsf_tag_hash' => HASH(0x845074c)
      'group' => ARRAY(0x845116c)
         0  'MyFavGene'
      'notes' => ARRAY(0x845122c)
         0  'the answer'
         1  'to LtUandE is'
         2  '42'
   '_parse_h' => HASH(0x8437dfc)
        empty hash
   '_primary_tag' => 'exon'
   '_record_err' => undef
   '_source_tag' => 'GMHMM'
   '_strict' => undef
   '_verbose' => undef
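
For anyone who wants to check their GFF2 lines against rules (1) to (6) above without loading BioPerl, here is a minimal stand-alone sketch in plain Perl. It is not the committed code: parse_gff2_line and the keys of the hash it returns are hypothetical names made up for illustration, and mapping a "." strand to 0 is an assumption based on the dump above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper, NOT the BioPerl routine: parse one GFF2 line into a
# plain hash following rules (1)-(6) from this message.
sub parse_gff2_line {
    my ($line) = @_;
    chomp $line;

    # (1) fields are TAB-separated; everything after the 8th tab is attributes
    my ($seqname, $source, $primary, $start, $end,
        $score, $strand, $frame, $attribs) = split /\t/, $line, 9;
    die "expected at least 8 tab-separated fields\n" unless defined $frame;
    $attribs = '' unless defined $attribs;

    my (%tags, $tag);
    pos($attribs) = 0;
    while (1) {
        # (6) a '#' outside quotes starts a comment; stop there or at the end
        last if $attribs =~ /\G\s*(?:#|\z)/gc;
        # (3) a semicolon closes the current tag/value unit
        if ($attribs =~ /\G\s*;/gc) { undef $tag; next; }
        # (2) no default "group" tag: every unit must start with tag=
        if (!defined $tag) {
            $attribs =~ /\G\s*([^\s=;#"]+)\s*=/gc
                or die "expected tag=value at offset " . pos($attribs) . "\n";
            $tag = $1;
            $tags{$tag} ||= [];
            next;
        }
        # (5) double-quoted free text is a single value
        if ($attribs =~ /\G\s*"([^"]*)"/gc) { push @{ $tags{$tag} }, $1; next; }
        # (4) otherwise values are space-separated bare words
        if ($attribs =~ /\G\s*([^\s;#"]+)/gc) { push @{ $tags{$tag} }, $1; next; }
        die "cannot parse attributes at offset " . pos($attribs) . "\n";
    }

    # Assumption: '+' => 1, '-' => -1, anything else (e.g. '.') => 0
    my %strand_map = ('+' => 1, '-' => -1);
    return {
        seqname => $seqname, source => $source, primary => $primary,
        start   => $start,   end    => $end,    score   => $score,
        strand  => $strand_map{$strand} || 0,
        frame   => $frame,   tags   => \%tags,
    };
}

# Rebuild the example line from this message with real TABs between fields
my $line = join "\t", qw(mysequence GMHMM exon 100 200 45 . .),
    'group=MyFavGene;notes="the answer" "to LtUandE is" 42 # these are comments';
my $feat = parse_gff2_line($line);

# Print each tag with its values, one tag per line, values joined by '|'
for my $t (sort keys %{ $feat->{tags} }) {
    print "$t: ", join('|', @{ $feat->{tags}{$t} }), "\n";
}
```

The attributes are scanned left to right with \G rather than split on semicolons or whitespace, so that a semicolon, space or # inside double-quoted free text does not break a value apart.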
If you are so inclined please give this a thorough working over and let me know if you find errors. So far it seems to be okay... touch wood!
Cheers all!
Mark
--
---
Dr. Mark Wilkinson
Bioinformatics Group
National Research Council of Canada
Plant Biotechnology Institute
110 Gymnasium Place
Saskatoon, SK
Canada