[Biopython] Deprecate Bio.GenBank.Record based GenBank parser?

Peter Cock p.j.a.cock at googlemail.com
Wed Oct 3 14:32:49 UTC 2018


Also logged on GitHub, with a couple of typos fixed:
https://github.com/biopython/biopython/issues/1817

Peter
On Wed, Oct 3, 2018 at 3:30 PM Peter Cock <p.j.a.cock at googlemail.com> wrote:
>
> Hello all,
>
> Am I right in thinking almost everyone working with GenBank
> or EMBL files in Biopython does so via Bio.SeqIO these days?
>
> Underneath, this calls the scanner/consumer parser defined in
> Bio.GenBank, where the scanner code breaks up the file into
> logical bits which are passed to a consumer which turns them
> into a Biopython data structure. For Bio.SeqIO, we build up a
> SeqRecord object, but there is an alternative consumer which
> builds up Bio.GenBank.Record objects instead.
>
> If you use the Bio.GenBank.read(...) or Bio.GenBank.parse(...)
> functions you will get Bio.GenBank.Record objects which are
> a quite direct representation of the underlying data structure,
> and str(...) will give you a GenBank formatted string. Here
> for example, the feature locations are left as plain strings.
>
> Does anyone use the Bio.GenBank.Record based GenBank
> parser? Could we deprecate it (in favour of only using the
> GenBank parser via Bio.SeqIO)? This would mean in a few
> releases' time, we could remove the old record class and
> potentially then simplify the GenBank/EMBL parsing.
>
> Peter
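
For anyone unsure which of the two routes their scripts use, the difference
described in the quoted message can be seen with a minimal sketch like the
one below. It assumes a reasonably recent Biopython (one where GenBank output
needs a "molecule_type" annotation); the record, its identifiers and the gene
feature are all made up for illustration and are not from any real entry.

```python
from io import StringIO

from Bio import GenBank, SeqIO
from Bio.Seq import Seq
from Bio.SeqFeature import FeatureLocation, SeqFeature
from Bio.SeqRecord import SeqRecord

# Build a tiny, hypothetical record and serialise it as GenBank text,
# so that the example does not depend on any external file.
demo = SeqRecord(
    Seq("ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"),
    id="DEMO0001.1",
    name="DEMO0001",
    description="Hypothetical demo record",
    annotations={"molecule_type": "DNA"},
)
demo.features.append(
    SeqFeature(
        FeatureLocation(9, 30, strand=-1),
        type="gene",
        qualifiers={"gene": ["demo"]},
    )
)
genbank_text = demo.format("genbank")

# Route 1: Bio.SeqIO builds a SeqRecord with parsed location objects.
seq_record = SeqIO.read(StringIO(genbank_text), "genbank")
seq_feature = seq_record.features[0]

# Route 2: Bio.GenBank builds a Bio.GenBank.Record object, which keeps
# the feature location as the plain string found in the file.
gb_record = GenBank.read(StringIO(genbank_text))
gb_feature = gb_record.features[0]

print("SeqIO location:  ", type(seq_feature.location).__name__)
print("GenBank location:", type(gb_feature.location).__name__)
print("GenBank location text:", gb_feature.location)
```

If your code only ever sees the first kind of object, the proposed
deprecation should not affect it.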

