[Bioperl-l] Next-Gen and the next point release - updates

Chris Fields cjfields at illinois.edu
Tue Sep 1 16:05:14 UTC 2009


On Sep 1, 2009, at 10:33 AM, Peter wrote:

> On Thu, Aug 27, 2009 at 12:55 PM, Peter wrote:
>>> The two conversions to solexa are still failing.  I'm not sure but I think
>>> it's something fairly simple, but I can't work on it until Friday (got too
>>> many other things on my plate ATM).  If I get stumped I'll post a message.
>>
>> ...
>>
>> This should narrow it down - the bug is in mapping PHRED
>> scores (from either Sanger or Illumina 1.3+ files) to the
>> Solexa encoding.
>>
>> Peter
>
> Hi Chris,
>
> I've just noticed BioPerl is treating invalid characters in the quality
> string as a warning condition (not an error):
> http://lists.open-bio.org/pipermail/open-bio-l/2009-September/000568.html
>
> It seems for fastq-sanger and fastq-illumina, these get given PHRED 0
> (character "!" or "@" respectively) which is reasonable. For fastq-solexa
> to fastq-solexa however, Solexa -5 (ASCII 59, character ";") does not get
> used - a bug?
>
> Also, in all these cases there is currently a spurious "data loss" warning:
>
> $ ./bioperl_sanger2sanger.pl < error_qual_null.fastq
>
> --------------------- WARNING ---------------------
> MSG: Unknown symbol with ASCII value 0 outside of quality range,
> ---------------------------------------------------
>
> --------------------- WARNING ---------------------
> MSG: Data loss for sanger: following values exceed max 93
>
> ---------------------------------------------------
> @SLXA-B3_649_FC8437_R1_1_1_850_123
> GAGGGTGTTGATCATGATGATGGCG
> +
> YYY!YYYYYYYYYWYYWYYSYYYSY
> @SLXA-B3_649_FC8437_R1_1_1_397_389
> GGTTTGAGAAAGAGAAATGAGATAA
> +
> YYYYYYYYYWYYYYWWYYYWYWYWW
> @SLXA-B3_649_FC8437_R1_1_1_850_123
> GAGGGTGTTGATCATGATGATGGCG
> +
> YYYYYYYYYYYYYWYYWYYSYYYSY
> @SLXA-B3_649_FC8437_R1_1_1_362_549
> GGAAACAAAGTTTTTCTCAACATAG
> +
> YYYYYYYYYYYYYYYYYYWWWWYWY
> @SLXA-B3_649_FC8437_R1_1_1_183_714
> GTATTATTTAATGGCATACACTCAA
> +
> YYYYYYYYYYWYYYYWYWWUWWWQQ
>
> Regards,
>
> Peter

Right, per off-list discussion this can be changed (I would rather it  
die there anyway).
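
For reference, a minimal sketch of the stricter behaviour, written in Python
rather than Perl and not taken from the BioPerl code. The names
FASTQ_VARIANTS, decode_quality and phred_to_solexa are made up for
illustration. The ASCII offsets (33 for Sanger, 64 for Solexa and Illumina
1.3+), the Sanger maximum of 93 and the Solexa minimum of -5 match the
values quoted above; the upper bound of 62 for the two offset-64 variants
is an assumption. An out-of-range character raises an error instead of
being replaced with a minimum score, and PHRED 0 maps to Solexa -5:

```python
import math

# Hypothetical table: variant name -> (ASCII offset, min score, max score).
# The upper bound of 62 for the offset-64 variants is an assumption.
FASTQ_VARIANTS = {
    "sanger": (33, 0, 93),
    "illumina": (64, 0, 62),
    "solexa": (64, -5, 62),
}


def decode_quality(qual, variant):
    """Decode a quality string, failing on any out-of-range character."""
    offset, lo, hi = FASTQ_VARIANTS[variant]
    scores = []
    for pos, ch in enumerate(qual):
        score = ord(ch) - offset
        if not lo <= score <= hi:
            # Die here rather than warn and substitute the minimum score.
            raise ValueError(
                f"ASCII {ord(ch)} at position {pos} is outside the "
                f"{variant} quality range"
            )
        scores.append(score)
    return scores


def phred_to_solexa(phred):
    """Map an integer PHRED score to the nearest Solexa score (floor of -5)."""
    if phred == 0:
        # The formula below is undefined at 0, so use the Solexa minimum.
        return -5
    return max(-5, round(10 * math.log10(10 ** (phred / 10) - 1)))
```

With this, decode_quality("YYY!", "sanger") gives [56, 56, 56, 0], while a
NUL byte in the quality string raises ValueError instead of producing a
record with a substituted score.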

chris



