[Bioperl-l] FASTQ support in Biopython, BioPerl, and EMBOSS

Chris Fields cjfields at illinois.edu
Mon Jul 27 13:06:58 UTC 2009


On Jul 27, 2009, at 6:51 AM, Peter wrote:

> On Sat, Jul 25, 2009 at 8:50 PM, Chris Fields <cjfields at illinois.edu> wrote:
>>
>> From this it could be summarized that converting to sanger format is
>> least problematic, as possible issues may be encountered when
>> converting to the other variants.  We'll need to fix the solexa quality
>> calculations in the BioPerl parser as noted in your previous post; I'll
>> work on that.
>>
>
> BioPerl SVN (revision 15887, just updated on the off chance you
> have committed any fixes recently) also has a problem going the
> other way (from FASTQ Sanger to FASTQ Solexa),
>
> $ more sanger_faked.fastq
> @Test PHRED qualities from 40 to 0 inclusive
> ACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTN
> +
> IHGFEDCBA@?>=<;:9876543210/.-,+*)('&%$#"!
>
> $ perl bioperl_sanger2solexa.pl < sanger_faked.fastq
> @Test PHRED qualities from 40 to 0 inclusive
> ACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTN
> +Test PHRED qualities from 40 to 0 inclusive
> hgfedcba`_^]\[ZYXWVUTSRQPONMLKJHGFEDB@><
>
> Depending on your email viewer this may not be obvious, but
> the sequence line is length 41 but the quality line is only 40
> characters. And again, I also suspect a problem in the mapping
> itself.
>
> Peter

I added this (and the others) to the ticket we're using to track this.
Looks like Solexa conversion is borked in both directions, which very
likely points to the score conversion code itself rather than the parsing.
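
For reference, here is a minimal Python sketch of what a Sanger-to-Solexa
quality conversion is generally expected to do. It is not the BioPerl or
Biopython code. It assumes the usual relation
Q_solexa = 10 * log10(10^(Q_phred/10) - 1), ASCII offsets of 33 (Sanger) and
64 (Solexa), and a floor of -5 for Solexa scores with PHRED 0 mapped to that
floor. The function names are made up for illustration.

```python
import math

SANGER_OFFSET = 33  # Sanger FASTQ stores PHRED scores at ASCII offset 33
SOLEXA_OFFSET = 64  # Solexa FASTQ stores Solexa scores at ASCII offset 64
MIN_SOLEXA = -5     # assumed floor for Solexa scores


def phred_to_solexa(phred):
    """Map one PHRED score to the nearest integer Solexa score (hypothetical helper)."""
    if phred <= 0:
        # log10(10^0 - 1) is undefined, so PHRED 0 is mapped to the floor (assumption)
        return MIN_SOLEXA
    solexa = 10 * math.log10(10 ** (phred / 10.0) - 1)
    return max(MIN_SOLEXA, round(solexa))


def sanger_to_solexa_quality(qual):
    """Convert a Sanger quality string to a Solexa one, one character per base."""
    return "".join(
        chr(phred_to_solexa(ord(c) - SANGER_OFFSET) + SOLEXA_OFFSET) for c in qual
    )


# PHRED 40 down to 0, the same scores as in sanger_faked.fastq above
sanger_qual = "".join(chr(q + SANGER_OFFSET) for q in range(40, -1, -1))
solexa_qual = sanger_to_solexa_quality(sanger_qual)

# The conversion is per character, so the length must be preserved
assert len(solexa_qual) == len(sanger_qual)
print(solexa_qual)
```

Because each input character produces exactly one output character, the
output has to be 41 characters for the test file above, so a 40-character
quality line means a score was dropped somewhere.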

chris


