[BioRuby] Parsing GFF3 attributes

Toshiaki Katayama ktym at hgc.jp
Fri May 18 15:23:51 UTC 2007


Hien,

Thank you for your report.

In bio/db/gff.rb, we have Bio::GFF::GFF2 for the version 2 spec and
Bio::GFF::GFF3 for the version 3 spec, and I added your modification
to the Bio::GFF::GFF3 class.
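
For illustration, here is a minimal sketch of how the two attribute
styles can be kept apart so that the GFF3 change does not affect
GFF / GFF2 parsing. The names SketchGFF, GFF2Attributes and
GFF3Attributes are hypothetical, and this is not the actual code in
bio/db/gff.rb. It assumes attributes are separated by plain ';' and
does not handle escaped semicolons or URL-encoded GFF3 values.

```ruby
# Hypothetical stand-ins for illustration only, not the BioRuby classes.
module SketchGFF
  # GFF / GFF2 style: key and value are separated by whitespace,
  # e.g.  Gene "1234" ; Transcript "1234"
  class GFF2Attributes
    SEPARATOR = ' '

    def self.parse(attributes)
      hash = {}
      # Assumption: a plain ';' always separates attributes.
      attributes.split(';').each do |atr|
        # self::SEPARATOR is looked up on the receiving class,
        # so a subclass can override only the separator.
        key, value = atr.strip.split(self::SEPARATOR, 2)
        next if key.nil? || key.empty?
        hash[key] = value
      end
      hash
    end
  end

  # GFF3 style: key and value are separated by '=',
  # e.g.  ID=gene1;Name=abc
  class GFF3Attributes < GFF2Attributes
    SEPARATOR = '='
  end
end

gff2 = SketchGFF::GFF2Attributes.parse('Gene "1234" ; Transcript "1234"')
gff3 = SketchGFF::GFF3Attributes.parse('ID=gene1;Name=abc def')
p gff2
p gff3
```

Because only the separator differs, the subclass leaves the GFF2
behaviour untouched while GFF3 lines are split on '='.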

Personally, I have not yet used GFF3 intensively, so if you think the class
should have more functionality to support new features in GFF3, please
propose them.

Toshiaki

On 2007/05/16, at 1:10, Michael Han wrote:

>
> On 15 May 2007, at 16:30, hienle at club-internet.fr wrote:
>> Hello all,
>>
>> I am working with a GFF3-formatted file and have noticed that the  
>> attributes field is not parsed properly.
>>
>> In bio/db/gff.rb,
>>
>>     75      def parse_attributes(attributes)
>>     76        hash = Hash.new
>>     77        attributes.split(/[^\\];/).each do |atr|
>>     78          key, value = atr.split(' ', 2)
>>     79          hash[key] = value
>>     80        end
>>     81        return hash
>>     82      end
>>     83    end
>>
>> I changed:
>>     78          key, value = atr.split(' ', 2)
>> to:
>>     78          key, value = atr.split('=', 2)
>>
>> and it now appears to behave properly. However, I am not certain if  
>> this is appropriate for backward compatibility with GFF and GFF2.
>
> I normally use spaces between the key and the value of the attributes
> for GFF2, like: Gene "1234" ; Transcript "1234"
> as described in <http://www.sanger.ac.uk/Software/formats/GFF/GFF_Spec.shtml>
>
> so the change would break GFF / GFF2 parsing.
> Maybe you could create a separate GFF3 parser that inherits from the
> one in gff.rb.
>
> Some GFF3 reference (note: the latest version is from a few weeks ago):
> <http://www.sequenceontology.org/gff3.shtml>
>
>> Is anyone working on parsing GFF3 files?
>>
>> Thank you in advance for your help,
>> -Hien
>
> Michael
> _______________________________________________
> BioRuby mailing list
> BioRuby at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioruby



