[Biopython-dev] GFF parser bug?

Eli Papa elipapa at mit.edu
Sun Apr 25 23:09:51 UTC 2010


Hello,

While trying to use the GFF parser I ran into a ValueError.

I think it's caused by entries in the attributes column (column 9) of my
GFF3 file that are written as a bare 'value', for example "Lack 3'-end"
or "Complete", instead of as 'key=value'.

Hope this helps,
eli



In [1]: from BCBio.GFF import GFFExaminer
In [2]: import pprint
In [3]: in_file = "V1.UC-9.scaftig.more500.gff"
In [4]: examiner = GFFExaminer()
In [5]: in_handle = open(in_file)
In [6]: pprint.pprint(examiner.parent_child_map(in_handle))
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)

/data/elipapa/gutmetahit/SingleSample_GenePrediction/<ipython console>
in <module>()

/home/elipapa/lib/python/bcbio-0.1-py2.4.egg/BCBio/GFF/GFFParser.py in
_file_or_handle_inside(*args, **kwargs)
  705             in_handle = open(in_file)
  706         args = (args[0], in_handle) + args[2:]
--> 707         out = fn(*args, **kwargs)
  708         if need_close:
  709             in_handle.close()

/home/elipapa/lib/python/bcbio-0.1-py2.4.egg/BCBio/GFF/GFFParser.py in
parent_child_map(self, gff_handle)
  789             if line.strip():
  790                 line_type, line_info = _gff_line_map(line,
--> 791                         self._get_local_params())[0]
  792                 if (line_type == 'parent' or (line_type == 'child' and
  793                         line_info['id'])):

/home/elipapa/lib/python/bcbio-0.1-py2.4.egg/BCBio/GFF/GFFParser.py in
_gff_line_map(line, params)
  158             # collect all of the base qualifiers for this item
  159             if len(parts) > 8:
--> 160                 quals, is_gff2 = _split_keyvals(gff_parts[8])
  161             else:
  162                 quals, is_gff2 = dict(), False

/home/elipapa/lib/python/bcbio-0.1-py2.4.egg/BCBio/GFF/GFFParser.py in
_split_keyvals(keyval_str)
   84                 pieces.append(p.strip().split(" "))
   85             key_vals = [(p[0], " ".join(p[1:])) for p in pieces]
---> 86         for key, val in key_vals:
   87             # remove quotes in GFF2 files
   88             if (len(val) > 0 and val[0] == '"' and val[-1] == '"'):

ValueError: need more than 1 value to unpack
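
The failure can be reproduced without the parser. This is only a minimal
sketch: it assumes the GFF3 code path splits each ';'-separated piece on
'=' (the traceback does not show that line, so the split step is a guess),
and split_pairs and unpack_error are hypothetical names, not BCBio
functions.

```python
def split_pairs(attr_column):
    """Hypothetical stand-in for the GFF3 split step (an assumption).

    Returns one list per ';'-separated piece, split on '='.
    """
    pieces = [p for p in attr_column.strip().split(";") if p]
    return [p.split("=") for p in pieces]


def unpack_error(attr_column):
    """Return the ValueError message from two-name unpacking, or None."""
    try:
        for key, val in split_pairs(attr_column):
            pass
    except ValueError as err:
        return str(err)
    return None


good = "ID=GL0000006;Name=GL0000006;"
bad = "ID=GL0000006;Name=GL0000006;Lack 3'-end;"

print(unpack_error(good))  # no error for pure key=value attributes
print(unpack_error(bad))   # a ValueError message; wording varies by Python version
```

The bare piece "Lack 3'-end" contains no '=', so splitting it yields a
one-element list, and unpacking that into key and val is what raises.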



*******

The gff file is as follows:

##gff-version 3
##sequence-region scaffold4215_3 1 6526
scaffold4215_3  glimmer gene    3       62      .       -       .       ID=GL0000006;Name=GL0000006;Lack 3'-end;
scaffold4215_3  glimmer mRNA    3       62      .       -       .       ID=GL0000006;Name=GL0000006;Parent=GL0000006;Lack 3'-end;
scaffold4215_3  glimmer CDS     3       62      2.84    -       0       Parent=GL0000006;Lack 3'-end;
scaffold4215_3  glimmer gene    124     1983    .       -       .       ID=GL0000007;Name=GL0000007;Complete;
[...]
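
One possible workaround on the file side, sketched under assumptions: the
bare tokens in column 9 could be rewritten as key=value pairs before the
file reaches the parser. The helper names below are hypothetical, the file
is assumed to be tab-separated, and using "Note" as the key is my own
choice rather than something the parser requires.

```python
def fix_attributes(attr_column, bare_key="Note"):
    """Rewrite bare tokens in a GFF3 attribute column as bare_key=token."""
    fixed = []
    for piece in attr_column.strip().split(";"):
        piece = piece.strip()
        if not piece:
            continue  # skip the empty piece left by a trailing ';'
        if "=" not in piece:
            # bare value such as "Complete": give it a key (assumed: Note)
            piece = "%s=%s" % (bare_key, piece)
        fixed.append(piece)
    return ";".join(fixed)


def fix_line(line):
    """Apply fix_attributes to column 9 of a tab-separated feature line."""
    if line.startswith("#"):
        return line  # directives and comments pass through unchanged
    parts = line.rstrip("\n").split("\t")
    if len(parts) < 9:
        return line  # no attribute column to fix
    parts[8] = fix_attributes(parts[8])
    return "\t".join(parts) + "\n"


print(fix_attributes("ID=GL0000007;Name=GL0000007;Complete;"))
# ID=GL0000007;Name=GL0000007;Note=Complete
```

Note that this sketch also drops the trailing ';' from the column and does
not escape any characters inside the values.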


