[Bioperl-l] Bio::Tools::GFF parsing error

Marc Logghe Marc.Logghe at DEVGEN.com
Wed Feb 15 16:13:16 UTC 2006


Hi Rob,
According to the GFF Specifications Document @
http://www.sanger.ac.uk/Software/formats/GFF/GFF_Spec.shtml :
<quote>
All of the above described fields should be separated by TAB characters
('\t'). All values of the mandatory fields should not include whitespace
(i.e. the strings for <seqname>, <source> and <feature> fields).
</quote>
Reading that, I am afraid you will have to pre-process your GFF input
file to strip the space padding from the tab-separated fields ...
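
A minimal sketch of such a pre-processing step, written in Python as a
standalone filter. It assumes the columns in the file really are
TAB-separated and that only the first eight columns need trimming; the
function name clean_gff_line and the file names in the usage line are
made up for illustration and are not part of BioPerl.

```python
import sys


def clean_gff_line(line):
    """Strip space padding from the tab-separated fields of one GFF2 line."""
    stripped = line.rstrip("\r\n")
    # Pass blank lines, comments and ## directives through unchanged.
    if not stripped.strip() or stripped.startswith("#"):
        return stripped
    # Split into at most 9 columns so the attributes column stays in one piece.
    fields = stripped.split("\t", 8)
    # Trim only the eight fixed columns; the attributes column is left as-is.
    fields[:8] = [field.strip() for field in fields[:8]]
    return "\t".join(fields)


if __name__ == "__main__":
    # Read GFF from stdin, write the cleaned lines to stdout.
    for raw in sys.stdin:
        print(clean_gff_line(raw))
```

Saved as, say, clean_gff.py, it could be run as
"python clean_gff.py < padded.gff > clean.gff" before handing the
result to Bio::Tools::GFF.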
HTH,
Marc


> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org 
> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of 
> Robert Buels
> Sent: Wednesday, February 15, 2006 5:01 PM
> To: bioperl-l at bioperl.org
> Subject: [Bioperl-l] Bio::Tools::GFF parsing error
> 
> Hi all,
> 
> I'm parsing a GFF2 file with Bio::Tools::GFF (I would be using
> FeatureIO, except it purports not to support GFF2), and the file
> looks like:
> 
> ##gff-version 2
> ##date 2006-02-13
> ##sequence-region C01HBa0088L02.seq 1 120525
> C01HBa0088L02   RepeatMasker    similarity      3537    4267     3.3    -       .       Target "Motif:bac_end_repeat_family_345" 1 740
> C01HBa0088L02   RepeatMasker    similarity      4172    4279     2.9    +       .       Target "Motif:HRSiTERT00100141" 1 104
> C01HBa0088L02   RepeatMasker    similarity      4267    4323     0.0    -       .       Target "Motif:k_29" 150 206
> C01HBa0088L02   RepeatMasker    similarity      4322    4492    26.6    +       .       Target "Motif:PRSiTERT00300001" 1960 2129
> C01HBa0088L02   RepeatMasker    similarity      4557    5124    29.5    +       .       Target "Motif:PRSiTERT00300001" 2142 2711
> 
> Notice the score column is padded with spaces.
> 
> Bio::Tools::GFF does not like this, and says that ' 3.3' is 
> not a valid score.  My question is, who is wrong here, my 
> input file or Bio::Tools::GFF?  Should Bio::Tools::GFF be 
> able to read this file?
> 
> Rob
> 
> --
> Robert Buels
> SGN Bioinformatics Analyst
> 252A Emerson Hall, Cornell University
> Ithaca, NY  14853
> Tel: 607-255-2360
> rmb32 at cornell.edu
> http://www.sgn.cornell.edu
> 
> 



