[Bioperl-l] Converting GFF2 records to GFF3
Razi Khaja
razi at genet.sickkids.on.ca
Thu Dec 23 15:54:40 EST 2004
Sorry for cross-posting, but this may be relevant to both bioperl and song-devel.
I've written a small script to convert GFF2 records to GFF3 using bioperl, and another to go the other way (see gff2_to_gff3.pl and gff3_to_gff2.pl below).
In doing this I have noticed some problems with the conversion.
The method Bio::Tools::GFF::_gff3_string quotes an attribute value if it contains any character outside [a-zA-Z0-9,;=.:%^*$@!+_?-] (i.e. $value = '"'.$value.'"';) and outputs empty quotes for tags without values (i.e. $value = "\"\"";).
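Read literally, the quoting rule described above amounts to something like the following. This is only a minimal sketch to make the behaviour concrete: quote_gff3_value is a hypothetical name and the body is reconstructed from the description, not copied from the Bio::Tools::GFF source.

```perl
use strict;
use warnings;

# Hypothetical helper that mirrors the quoting rule described above.
# It is NOT the actual Bio::Tools::GFF code.
sub quote_gff3_value {
    my ($value) = @_;

    # Tags without a value come out as a pair of empty quotes.
    return '""' unless defined $value && length $value;

    # Wrap the value in double quotes if any character falls
    # outside the "safe" set.
    if ( $value =~ /[^a-zA-Z0-9,;=.:%^*\$\@!+_?-]/ ) {
        $value = '"' . $value . '"';
    }
    return $value;
}

print quote_gff3_value('AB000123'), "\n";              # left as-is
print quote_gff3_value('hypothetical protein'), "\n";  # space triggers quoting
print quote_gff3_value(''), "\n";                      # empty value
```

Note that a single space is enough to trigger quoting, so most free-text notes end up wrapped in quotation marks.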
Currently the gff3 spec says: "Unescaped quotation marks, ... are explicitly forbidden."
This brings up 2 questions:
(1) Are quotes necessary in gff3?
(2) When a value is empty, what should be output?
a) Tag="";
b) Tag=.;
c) Tag=;
d) nothing?
(Apart from not meeting the spec, this makes it difficult to do transformations from gff2 to gff3 and back to gff2 again.)
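One quote-free possibility for question (1) would be to percent-encode the problem characters instead, assuming the URL-style escaping that the GFF3 spec describes for reserved characters in the attributes column. The sketch below is only an illustration of that idea: gff3_escape and gff3_unescape are hypothetical helpers, not part of Bio::Tools::GFF, and the exact set of characters to escape is an assumption. It does not settle question (2).

```perl
use strict;
use warnings;

# Hypothetical helpers, not part of Bio::Tools::GFF.

# Percent-encode control characters (including tab and newline),
# the double quote, and the characters ; = & , %
sub gff3_escape {
    my ($value) = @_;
    $value =~ s/([\x00-\x1f\x7f";=&,%])/sprintf('%%%02X', ord $1)/ge;
    return $value;
}

# Reverse the encoding: turn every %XX back into the character it stands for.
sub gff3_unescape {
    my ($value) = @_;
    $value =~ s/%([0-9A-Fa-f]{2})/chr hex $1/ge;
    return $value;
}

my $note    = 'kinase; binds "ATP", 50% identity';
my $escaped = gff3_escape($note);

print "Note=$escaped\n";
print gff3_unescape($escaped) eq $note
    ? "round trip ok\n"
    : "round trip failed\n";
```

Because the encoding is reversible, a value written this way can be decoded again when converting back to GFF2, which avoids the round-trip problem that added quotes cause.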
# ===== gff2_to_gff3.pl =====
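#!/usr/bin/perl
# Read GFF2 records from a file and print each one as a GFF3 line.
use strict;
use warnings;
use Bio::Tools::GFF;

my ($gff2File) = @ARGV;
die "Usage: $0 <gff2 file>\n" unless defined $gff2File;

my $gffio = Bio::Tools::GFF->new(-file => $gff2File, -gff_version => 2);
while ( my $feature = $gffio->next_feature() ) {
    # _gff3_string is private by convention (leading underscore);
    # it is called directly here to show exactly what it produces.
    my $gff3string = $gffio->_gff3_string($feature);
    print "$gff3string\n";
}
$gffio->close();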
# ===== gff3_to_gff2.pl =====
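#!/usr/bin/perl
# Read GFF3 records from a file and print each one as a GFF2 line.
use strict;
use warnings;
use Bio::Tools::GFF;

my ($gff3File) = @ARGV;
die "Usage: $0 <gff3 file>\n" unless defined $gff3File;

my $gffio = Bio::Tools::GFF->new(-file => $gff3File, -gff_version => 3);
while ( my $feature = $gffio->next_feature() ) {
    # _gff2_string is private by convention (leading underscore);
    # it is called directly here to show exactly what it produces.
    my $gff2string = $gffio->_gff2_string($feature);
    print "$gff2string\n";
}
$gffio->close();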
/**
* Razi Khaja, Bioinformatics Analyst
* The Hospital for Sick Children, Toronto
*/