[EMBOSS] ABI to FASTQ with seqret

Peter biopython at maubp.freeserve.co.uk
Thu Jul 22 13:13:46 UTC 2010


On Thu, Jul 22, 2010 at 1:28 PM, Peter Rice <pmr at ebi.ac.uk> wrote:
>
> On 22/07/10 12:22, Peter C. wrote:
>
>>> I truncated this for brevity. Here the quality string repeats ASCII 34, ASCII 33
>>> (PHRED quality 1, quality 0) which is rather strange. The sequence appears
>>> to agree with the provided file pGEM_(ABI)_A01.seq
>>>
>>> Have I just been unlucky with the AB1 files that I have looked at? Thus
>>> far all the quality scores seem meaningless.
>
> There are two sets of quality scores in that file. Both are the
> alternating byte values 1 and 0. Adding 33 gives the scores you see.
>
> Looks as though EMBOSS is just reporting what it finds.
>
> The file offset is the value returned by function
> ajSeqABIGetConfidOffset. It simply reads one byte from there for each
> base of sequence length.

Looks like that particular random example from the internet was just odd.
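
For anyone following along, here is a minimal Python sketch of the
arithmetic described above: read one confidence byte per base from a
given offset, then add 33 to get the FASTQ quality character. This is
not EMBOSS code. The function names and the fake input bytes are made
up for illustration, and the offset is assumed to be already known
(in EMBOSS it would come from ajSeqABIGetConfidOffset).

```python
PHRED_OFFSET = 33  # Sanger-style FASTQ: quality character = score + 33


def read_confidences(data, offset, n_bases):
    """Read one confidence byte per base, starting at `offset`."""
    chunk = data[offset:offset + n_bases]
    if len(chunk) != n_bases:
        raise ValueError("data too short for the requested number of bases")
    return list(chunk)  # iterating over bytes yields ints


def to_fastq_quality(scores):
    """Encode integer scores as a FASTQ quality string."""
    return "".join(chr(q + PHRED_OFFSET) for q in scores)


# Hypothetical stand-in for a file: 4 filler bytes, then alternating 1, 0.
fake_file = b"XXXX" + bytes([1, 0] * 4)
scores = read_confidences(fake_file, offset=4, n_bases=8)
print(scores)                    # [1, 0, 1, 0, 1, 0, 1, 0]
print(to_fastq_quality(scores))  # "!"!"!"!
```

Stored values of 1 and 0 therefore come out as ASCII 34 and 33, which
matches the repeating pattern in the FASTQ output.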

>> I went back through my old emails, and see you had been testing with
>> http://www.appliedbiosystems.com/support/software_community/ab1_files.zip
>> (I had trouble downloading this with curl - Firefox worked). Looking at these
>> ABI files with seqret as FASTQ does seem to give meaningful quality scores.
>> Curious.
>
> It should look for a PCON tag in the file and pick up the second of two,
> or the first if there is only one.
>
> Can anyone on the list enlighten us further on what is intended for the
> quality scores in these example files?

As for the pGEM example, I have no idea - I just found it with Google.
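
On the PCON lookup quoted above, a small Python sketch of the rule as
described: take the second PCON entry if there are two, otherwise the
first. The (name, offset) tuple layout and the function name are
assumptions for illustration, not the real directory structure or the
EMBOSS implementation. What should happen with more than two PCON
entries is not stated, so the sketch simply keeps taking the second.

```python
def pick_pcon_offset(tags):
    """Pick the PCON offset from (name, offset) tuples given in file order.

    Returns the second PCON offset when at least two exist, the first
    when there is exactly one, and None when there is none.
    """
    pcon_offsets = [offset for name, offset in tags if name == "PCON"]
    if not pcon_offsets:
        return None
    if len(pcon_offsets) >= 2:
        return pcon_offsets[1]
    return pcon_offsets[0]


# Hypothetical tag lists; the names other than PCON and all offsets
# are invented for the example.
two = [("PBAS", 100), ("PCON", 200), ("PBAS", 300), ("PCON", 400)]
one = [("PBAS", 100), ("PCON", 200)]
print(pick_pcon_offset(two))  # 400
print(pick_pcon_offset(one))  # 200
```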

I can send you a couple of our locally produced AB1 files off list
if you wouldn't mind having a look at them. It may be that however
these are being generated there simply are no useful scores inside.

Peter


