[EMBOSS] ABI to FASTQ with seqret

Peter biopython at maubp.freeserve.co.uk
Thu Jul 22 11:16:48 UTC 2010


On Thu, Apr 22, 2010 at 6:01 PM, Peter Rice <pmr at ebi.ac.uk> wrote:
>
> On 22/04/2010 16:48, Peter Cock wrote:
>
>> Does this mean there is an updated seqret in a public repository where I
>> can convert an ABI file to FASTQ taking the ABI basecaller's sequence
>> and PHRED scores? I'd be interested to test that... or a patch against
>> EMBOSS 6.2.0.
>
> It is in the latest CVS code and will appear in the July release.
>

Hi Peter R et al,

I've just compiled and installed EMBOSS 6.3.1 on Mac OS X, and tried
converting some ABI (extension .ab1) files from our in-house sequencing
service to FASTQ. So far all the examples give Sanger FASTQ quality strings
consisting entirely of "!" (ASCII 33, PHRED quality zero) or Illumina FASTQ
quality strings consisting entirely of "@" (ASCII 64, again PHRED quality zero).
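
For reference, the two encodings mentioned above differ only in the ASCII
offset added to each PHRED score. A minimal Python sketch of that mapping
follows; the function and constant names are illustrative only and are not
part of EMBOSS:

```python
# Illustrative sketch only: these names are hypothetical, not EMBOSS code.
SANGER_OFFSET = 33    # Sanger FASTQ: PHRED 0 is written as "!"
ILLUMINA_OFFSET = 64  # Illumina FASTQ (offset 64): PHRED 0 is written as "@"


def encode_qualities(qualities, offset=SANGER_OFFSET):
    """Turn a list of integer PHRED scores into a FASTQ quality string."""
    return "".join(chr(q + offset) for q in qualities)


# A read whose scores are all zero gives the strings described above.
print(encode_qualities([0, 0, 0]))                   # !!!
print(encode_qualities([0, 0, 0], ILLUMINA_OFFSET))  # @@@
```

So a quality line made up only of "!" (or only of "@") means every base was
written out with PHRED quality zero.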

I remember you saying ABI files can have two sets of quality scores,
so perhaps my files have one set all of PHRED zero?

I tried to find some 3rd party example files via Google, for example on
http://www.elimbio.com/sequencing_sample_files.htm they have a zip
file http://www.elimbio.com/Forms/pGEM.zip containing one ABI file.
The output of this is more interesting:

$ seqret -sformat abi -osformat fastq -auto -stdout -sequence \
    pGEM_\(ABI\)_A01.ab1
@pGEM_(ABI)
NANTCTATAGGCGAATTCGAGCTCGGTA...GNN
+
"!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"...!"!"!"

I truncated this for brevity. Here the quality string alternates between
ASCII 34 and ASCII 33 (PHRED quality 1, then quality 0), which is rather
strange. The sequence appears to agree with the provided file
pGEM_(ABI)_A01.seq.
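
To double-check the reading of that quality string, here is a small Python
sketch that decodes a Sanger FASTQ quality string back to PHRED scores. The
helper name is hypothetical, and the sample is just the first few characters
of the output above:

```python
def decode_sanger(quality_string, offset=33):
    """Convert a Sanger FASTQ quality string into integer PHRED scores."""
    return [ord(char) - offset for char in quality_string]


# First six characters of the quality line shown above.
sample = '"!"!"!'
print(decode_sanger(sample))  # [1, 0, 1, 0, 1, 0]
```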

Have I just been unlucky with the AB1 files that I have looked at? Thus
far all the quality scores seem meaningless.

Peter C.


