[Biopython-dev] SeqIO and qual: Question about reading and writing qual files

Sebastian Bassi sbassi at clubdelarazon.org
Tue Mar 24 06:24:38 UTC 2009


I have a .fasta file and its corresponding .qual file.
I ran seqclean on the .fasta file and got a shorter .fasta file as
output (as expected).
Using the .cln file from seqclean, I want to trim the .qual file the
same way the new .fasta file was trimmed.
I can read the .cln file and parse the "where to trim" information.
For example, for an original sequence of 1000 bp, I may need to trim
it down to positions 150 to 800.
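
Parsing the "where to trim" information might look like the sketch
below. It rests on assumptions that should be checked against the
seqclean documentation: that each .cln line is tab-delimited with the
sequence name, the percentage of undetermined bases, the 5' and 3'
coordinates after cleaning, the original length and a trash code, and
that the coordinates are 1-based and inclusive. The names TrimInfo,
parse_cln_line and to_slice are made up for this illustration.

```python
from typing import NamedTuple


class TrimInfo(NamedTuple):
    """One line of a seqclean .cln report (column layout is an assumption)."""
    name: str
    end5: int        # 5' coordinate after cleaning (assumed 1-based, inclusive)
    end3: int        # 3' coordinate after cleaning (assumed 1-based, inclusive)
    length: int      # length of the original sequence
    trash_code: str  # empty when the sequence was kept


def parse_cln_line(line):
    """Split one tab-delimited .cln line into a TrimInfo."""
    fields = [field.strip() for field in line.rstrip("\n").split("\t")]
    if len(fields) < 5:
        raise ValueError("expected at least 5 tab-delimited fields: %r" % line)
    name, _percent_n, end5, end3, length = fields[:5]
    trash_code = fields[5] if len(fields) > 5 else ""
    return TrimInfo(name, int(end5), int(end3), int(length), trash_code)


def to_slice(info):
    """Convert 1-based inclusive coordinates to a 0-based Python slice."""
    return slice(info.end5 - 1, info.end3)


example = parse_cln_line("seq1\t0.50\t150\t800\t1000\t\t\n")
kept = list(range(example.length))[to_slice(example)]
```

With these assumptions, a record trimmed to 150..800 keeps
800 - 150 + 1 = 651 values.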
The problem is that I can't change the length of the quality list when
using the new SeqIO qual parser. The example in the documentation trims
a record with something like:
    sub_rec = fullrec[150:800]
But this works only when the record has a sequence (that is, when it is
read as "fastq"). It does not work when the record is read as "qual",
because there is no sequence. In that case I can change individual
values in letter_annotations['phred_quality'], but I cannot change the
length of the list.
Here is the error:
Traceback (most recent call last):
  File "/home/sbassi/bioinfo/INTA/qualparser.py", line 18, in <module>
    s.letter_annotations['phred_quality'] = [0,0,0,0,10,1]
  File "/home/sbassi/test/virtualenv-1.3.2/t6/lib/python2.5/site-packages/biopython-1.49-py2.5-linux-i686.egg/Bio/SeqRecord.py", line 33, in __setitem__
    "strings) of length %i." % self._length)
TypeError: We only allow python sequences (lists, tuples or strings) of length 5.


(Note: 5 was the length of the original qual record. I get this error
when I try to set the list to [0,0,0,0,10,1], which has 6 values.)

So my question is: does it make sense to allow the user to change the
length of the list in letter_annotations['phred_quality'] for records
read from qual files? I think this would be a nice feature for the qual
parser in SeqIO. If I could change the list length, I could save the
modified records with SeqIO.write(x, fh, "qual") and get a qual file
with the new lengths.
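
As a stopgap, the trimming could be done outside SeqIO. The following
is only a minimal sketch in plain Python, not Biopython code: the
helper names (read_qual, trim_quals, format_qual) and the wrapping at
20 values per line are hypothetical choices for illustration, and the
trim coordinates are assumed to be 0-based and end-exclusive, as in
fullrec[150:800].

```python
def read_qual(text):
    """Parse QUAL-formatted text into {record_id: [int, ...]}."""
    records = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            current = header[0] if header else ""
            records[current] = []
        elif current is not None:
            # Quality values are whitespace-separated integers.
            records[current].extend(int(value) for value in line.split())
    return records


def trim_quals(records, trims):
    """Slice each quality list; records without trim info are kept whole."""
    trimmed = {}
    for rec_id, quals in records.items():
        start, end = trims.get(rec_id, (0, len(quals)))
        trimmed[rec_id] = quals[start:end]
    return trimmed


def format_qual(records, width=20):
    """Render {record_id: [int, ...]} back into QUAL-formatted text."""
    lines = []
    for rec_id, quals in records.items():
        lines.append(">" + rec_id)
        for i in range(0, len(quals), width):
            lines.append(" ".join(str(q) for q in quals[i:i + width]))
    return "\n".join(lines) + "\n"


qual_text = ">read1\n10 20 30 40 50\n>read2\n5 5 5\n"
trimmed = trim_quals(read_qual(qual_text), {"read1": (1, 4)})
print(format_qual(trimmed), end="")
```

Here read1 is cut from five values to three, and read2 is left
untouched because no trim coordinates were given for it.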

I am using Biopython 1.49 with newer files from CVS.



-- 
Sebastián Bassi. Diplomado en Ciencia y Tecnología.




