[Bioperl-l] Sequence qual values

Charles Hauser chauser@duke.edu
23 Sep 2002 11:17:03 -0400


Hi,

As part of an EST project, I would like to trim sequences based on their
qual values.


Using a window whose size is 10% of the sequence length, I want to move
along the seq (advancing the window by 1 nt each round) and calculate
the mean qual for each window.

With the mean qual data I want to trim the 5' and 3' ends of the sequence
wherever the window mean falls below a cutoff value (i.e. keep only the
region where window qual >= 20).

So, in the case below, I would trim the seq to windows 3 <-> 6.

                        |........................|
qual:   5       12      30      36      59      21      8       6
window: 1       2       3       4       5       6       7       8
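
For concreteness, the procedure described above can be sketched as follows. This is only an illustration in Python, not BioPerl code; the function names (window_means, trim_bounds) and the default parameters are hypothetical. It assumes the window size is 10% of the read length, rounded down with a minimum of 1, and that the kept region runs from the start of the first passing window to the end of the last passing window.

```python
def window_means(quals, window):
    """Mean quality of every run of `window` consecutive bases, step 1 nt."""
    if window < 1 or window > len(quals):
        return []
    total = sum(quals[:window])
    means = [total / window]
    # Slide by one base: add the entering value, drop the leaving one.
    for i in range(window, len(quals)):
        total += quals[i] - quals[i - window]
        means.append(total / window)
    return means


def trim_bounds(quals, cutoff=20, fraction=0.10):
    """Return (start, end) so that quals[start:end] is kept, or None.

    Assumption: the kept region spans from the first window whose mean
    is >= cutoff to the end of the last such window (0-based, end exclusive).
    """
    window = max(1, int(len(quals) * fraction))
    means = window_means(quals, window)
    passing = [i for i, m in enumerate(means) if m >= cutoff]
    if not passing:
        return None  # no window reaches the cutoff; discard the read
    return passing[0], passing[-1] + window


if __name__ == "__main__":
    quals = [5, 12, 30, 36, 59, 21, 8, 6]
    print(trim_bounds(quals))
```

With the eight toy values above the window is one value wide, so the kept slice corresponds to windows 3 <-> 6. The same (start, end) pair can be applied to the matching sequence string.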



Data Formats: (separate files)

Qual data:

>1112026H03.x1 PHD_FILE: 1112026H03.x1.phd.1
8 8 8 8 8 6 6 6 6 6 8 8 8 11 19 12 10 10 11 11 12 12


Seq data format (fasta):

>1112026H03.x1  CHROMAT_FILE: 1112026H03.x1 PHD_FILE: 1112026H03.x1.phd.1 CHEM:
GTCTGCTGAACTACACTACGGTCGAAGGGGAACGGGCCCCCACTCGACAT
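
Since the qual values and the sequences live in separate files, they have to be paired by read ID before trimming. A minimal sketch of that step, again in Python for illustration only: the helper names (read_fasta_like, load_quals, load_seqs) are hypothetical, and it assumes each '>' header occupies a single line and that the first whitespace-delimited token of the header is the read ID.

```python
def read_fasta_like(lines):
    """Yield (record_id, payload_lines) from FASTA-style text.

    Assumption: the record ID is the first token after '>'.
    """
    rec_id, payload = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if rec_id is not None:
                yield rec_id, payload
            rec_id, payload = line[1:].split()[0], []
        elif rec_id is not None:
            payload.append(line)
    if rec_id is not None:
        yield rec_id, payload


def load_quals(lines):
    """Map read ID -> list of integer quality values."""
    return {rid: [int(q) for row in rows for q in row.split()]
            for rid, rows in read_fasta_like(lines)}


def load_seqs(lines):
    """Map read ID -> sequence string (payload lines joined)."""
    return {rid: "".join(rows) for rid, rows in read_fasta_like(lines)}


if __name__ == "__main__":
    qual_lines = [
        ">1112026H03.x1 PHD_FILE: 1112026H03.x1.phd.1",
        "8 8 8 8 8 6 6 6 6 6 8 8 8 11 19 12 10 10 11 11 12 12",
    ]
    quals = load_quals(qual_lines)
    print(quals["1112026H03.x1"])
```

Reading real files would just mean passing open file handles instead of the in-memory lists; the excerpts shown in this mail are shortened, so their lengths do not match each other.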

Looking through the docs, I see there is a module for reading qual
values (Bio::Seq::PrimaryQual).

Before diving in, I thought I would check if anyone else has done
something similar, and if so what their approach has been.


regards,

Charles