[Biopython-dev] SeqIO and qual: Question about reading and writing qual files

Peter biopython at maubp.freeserve.co.uk
Tue Mar 24 15:13:40 UTC 2009


On Tue, Mar 24, 2009 at 2:59 PM, Sebastian Bassi
<sbassi at clubdelarazon.org> wrote:
> But anyway, regarding this:
>
>> This was one area of the new SeqRecord slicing I was a little unsure
>> about - slicing a qual file's SeqRecord (or any SeqRecord with a None
>> for the sequence).  I hadn't done anything about it immediately as I
>> couldn't think of a use case for it - so that's solved ;)
>> One solution would be to introduce an UnknownSeq object, which
>> ....
>
> I agree with the need for an UnknownSeq object to modify the size of
> the qual file.

Suppose you read in a qual file (or a GenBank file with no sequence, just a
CONTIG line), and instead of None, the SeqRecord object(s) had a new
UnknownSeq object saying they were made up of a given number of "N"
characters using a DNA alphabet. What would you expect to get if you
used Bio.SeqIO to write out the file in FASTA format?  To my mind there
are two sensible options - write out the file using the "NNN....N"
sequence, or raise an error.
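
To make the two options concrete, here is a rough, self-contained sketch. It
is not Biopython code: the UnknownSeq class, the to_fasta function and its
allow_unknown flag are all hypothetical names made up for illustration, and
the real thing would presumably live inside Bio.Seq and Bio.SeqIO.

```python
class UnknownSeq:
    """Hypothetical placeholder: a sequence of known length, unknown content."""

    def __init__(self, length, character="N"):
        if length < 0:
            raise ValueError("length must be non-negative")
        self.length = length
        self.character = character

    def __len__(self):
        return self.length

    def __str__(self):
        # Expand to the explicit "NNN...N" string only when asked.
        return self.character * self.length

    def __getitem__(self, index):
        if isinstance(index, slice):
            # Slicing only changes the length; the content stays unknown.
            new_length = len(range(*index.indices(self.length)))
            return UnknownSeq(new_length, self.character)
        if not -self.length <= index < self.length:
            raise IndexError("sequence index out of range")
        return self.character


def to_fasta(name, seq, allow_unknown=True, width=60):
    """Format one record as FASTA text (assumed helper, not Bio.SeqIO).

    allow_unknown=True  -> option 1: write the "NNN...N" sequence
    allow_unknown=False -> option 2: raise an error
    """
    if isinstance(seq, UnknownSeq) and not allow_unknown:
        raise ValueError("Record %r has no sequence data to write" % name)
    text = str(seq)
    lines = [text[i:i + width] for i in range(0, len(text), width)]
    return ">%s\n%s\n" % (name, "\n".join(lines))
```

With this sketch, to_fasta("read1", UnknownSeq(5)) gives a record whose
sequence line is "NNNNN", while passing allow_unknown=False raises a
ValueError instead. Slicing, e.g. UnknownSeq(10)[2:5], just gives a shorter
UnknownSeq, which is what trimming a qual file's record would need.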

Peter



