[Biopython] Increase line length when writing EMBL format

Pedro Almeida p.almeida.mc at gmail.com
Fri Sep 18 12:32:52 UTC 2020


Dear Biopython developers and enthusiasts,

I’m working on a script that makes some modifications to an EMBL-format file I have at hand. Everything seems to be working OK, except that for some features `SeqIO.write(record, fh, 'embl')` writes the final closing quote (`"`) on a new line, as if it were a feature line of its own.

Here’s what the original feature looks like:

```
FT                   /standard_name="species:rnd-4_family-1331|genus:Unspecified"
```

but with `SeqIO.write` it gets printed on two lines as:

```
FT                   /standard_name="species:rnd-4_family-1331|genus:Unspecified
FT                   "
```

I remember reading (I can’t remember where, though) that the ‘embl’ format mostly reuses the GenBank structure, so I thought that increasing the value of `record.GB_LINE_LENGTH`, say to 100 (`record.GB_LINE_LENGTH = 100`), might work, but it doesn’t.

I actually suspect that `record.GB_LINE_LENGTH` is not taken into account when writing the ‘embl’ format, because its default value seems to be [79](https://biopython.org/docs/1.75/api/Bio.GenBank.Record.html#Bio.GenBank.Record.Record.GB_LINE_LENGTH), yet by default the line above is printed with a width of 81.

Any ideas or suggestions on how to work around this? I could probably write another parser to correct for this, but it would be easier and better if this could be handled within Biopython.
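
A minimal sketch of that post-processing fallback, in plain Python and independent of Biopython. It assumes the only symptom is an `FT` line that contains nothing but the closing quote, and it simply glues that quote back onto the previous line. The helper name `rejoin_dangling_quotes` is hypothetical, not part of any library.

```python
import re

# An FT continuation line holding nothing but a closing double quote.
DANGLING_QUOTE = re.compile(r'^FT\s+"\s*$')


def rejoin_dangling_quotes(lines):
    """Return a copy of `lines` with lone closing-quote FT lines merged upward.

    `lines` is any iterable of text lines, each ending in a newline.
    """
    fixed = []
    for line in lines:
        if fixed and DANGLING_QUOTE.match(line):
            # Move the quote to the end of the previous line.
            fixed[-1] = fixed[-1].rstrip("\n") + '"\n'
        else:
            fixed.append(line)
    return fixed


# The two lines produced in the example above.
indent = "FT" + " " * 19
wrapped = [
    indent + '/standard_name="species:rnd-4_family-1331|genus:Unspecified\n',
    indent + '"\n',
]
print("".join(rejoin_dangling_quotes(wrapped)), end="")
```

To apply this to real output, the file written by `SeqIO.write` could be read back line by line, passed through the helper, and saved again. Note that the rejoined line is 81 characters wide, which, as far as I understand, is one more than the 80 columns EMBL flat files normally allow, so stricter tools may still object to it.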

Many thanks,
Pedro






