[emboss-dev] Bug reports and patches: BAM quality, SAM negative ISIZE

Peter biopython at maubp.freeserve.co.uk
Mon Aug 2 15:52:56 UTC 2010


On Mon, Aug 2, 2010 at 4:42 PM, Peter Rice <pmr at ebi.ac.uk> wrote:
>
> On 02/08/10 14:55, Peter C. wrote:
>
>> In the funny BAM to Sanger FASTQ conversion, EMBOSS has used
>> "]" which is ASCII 93, giving PHRED 93-33 = 60, i.e. 33 more than it
>> should be. I suspected that the EMBOSS code for reading BAM files
>> was wrongly applying a 33 offset to the quality scores. In BAM files
>> the scores are simply encoded directly as uint8_t without any offset.
>
> Thanks for spotting that. We will make a patch with that fix in.
>
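
For anyone reading this in the archive, the offset arithmetic can be
sketched as below. This is only an illustration in Python, not the
EMBOSS code; the function and constant names are made up for the
example. It assumes the BAM record holds raw PHRED scores as bytes,
so the +33 offset is applied exactly once, when writing Sanger FASTQ.

```python
SANGER_OFFSET = 33      # Sanger FASTQ stores chr(PHRED + 33)
MAX_SANGER_PHRED = 93   # chr(93 + 33) == "~", the last printable ASCII char


def bam_quals_to_sanger(raw_quals: bytes) -> str:
    """Encode raw BAM quality bytes as a Sanger FASTQ quality string.

    Hypothetical helper: BAM stores each PHRED score directly as an
    unsigned byte, so the offset is added here and nowhere else.
    """
    chars = []
    for phred in raw_quals:  # iterating over bytes yields ints 0..255
        if phred > MAX_SANGER_PHRED:
            raise ValueError(
                f"PHRED score {phred} cannot be written as Sanger FASTQ"
            )
        chars.append(chr(phred + SANGER_OFFSET))
    return "".join(chars)


def sanger_to_phred(quality_string: str) -> list:
    """Decode a Sanger FASTQ quality string back to PHRED scores."""
    return [ord(c) - SANGER_OFFSET for c in quality_string]
```

With this, a BAM score of 27 is written as "<" (ASCII 60). Adding the
offset a second time gives ASCII 93, which is the "]" seen in the
faulty output and decodes to PHRED 60.
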
>> Looking at the SAM file, I guessed EMBOSS doesn't like a negative
>> ISIZE field in the next record, EAS54_61:4:143:69:578,  .........
>>
>> Looking at the source code, currently EMBOSS is wrongly assuming
>> an unsigned integer will be used. This is not true; the spec allows for
>> a negative ISIZE. I replaced this code in ajax/core/ajseqread.c
>
> Thanks for the fix. We will add that to the patch.
>
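
To illustrate the ISIZE point for the archive, here is a small Python
sketch (again, not the EMBOSS patch itself). The record is a made-up
example with a hypothetical read name, and the helper names are
invented for illustration. It assumes ISIZE is the ninth tab-separated
column of a SAM line and must be read as a signed integer.

```python
import struct

ISIZE_COLUMN = 8  # ninth tab-separated SAM field, as a 0-based index

# Hypothetical SAM record (not taken from the file discussed above):
# QNAME FLAG RNAME POS MAPQ CIGAR MRNM MPOS ISIZE SEQ QUAL
EXAMPLE = "\t".join([
    "read1", "147", "chr1", "190", "30", "10M",
    "=", "10", "-190", "ACGTACGTAC", "<<<<<<<<<<",
])


def parse_isize(sam_line: str) -> int:
    """Return the ISIZE field as a signed integer."""
    fields = sam_line.rstrip("\n").split("\t")
    return int(fields[ISIZE_COLUMN])  # int() accepts a leading minus sign


def as_unsigned32(value: int) -> int:
    """Reinterpret a signed 32-bit value as unsigned.

    Shows what a reader would store if it wrongly treated a negative
    ISIZE as an unsigned 32-bit integer.
    """
    return struct.unpack("<I", struct.pack("<i", value))[0]
```

Here parse_isize(EXAMPLE) returns -190, whereas as_unsigned32(-190)
gives 4294967106, which is clearly not a sensible insert size.
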

Great. Are you still issuing patches which don't affect the version number?
I'd prefer to have an easy way to know if a given install of EMBOSS
has certain fixes, and a point release seems quite straightforward from
an outsider's perspective.

P.S. Expect a couple more reports to follow... so don't rush a patch or
point release out just yet ;)

>> A related question is why did this error condition not give any
>> error message to stdout or stderr?
>
> This appears to be a general issue with reading unknown and known formats.
> We will fix it so that error messages are turned on for this failure
> condition.

Good :)

> Many thanks for the bug reports - and the fixes!!
>

No problem,

Peter



