[Bioperl-l] Fasta Qual files

James Gilbert jgrg@sanger.ac.uk
Thu, 14 Sep 2000 17:18:23 +0100 (BST)


On Thu, 14 Sep 2000 hilmar.lapp@pharma.Novartis.com wrote:

> > > I have code for reading Phred-produced seq/qual pairs of
> > > FASTA-formatted files into Bio::QualSeq objects, which are merely
> > > Bio::Seq objects with quality values (being truncated and reversed,
> > > too, when you truncate/reverse the seq). The code can also write
> > > qual-files.
> > >
> > > If you think this is useful for you I'll try to put it to the
> > > repository.
> >
> > Sounds very useful. Check it in!
> 
> Hi,
> 
> Sorry if I'm insulting anyone by stating the
> obvious.
> 
> I have to deal with quality values for the Sanger
> submissions.  It is a good trick to store the
> quality array in memory as a string of unsigned
> chars.  Perl arrays 100k long start consuming a
> lot of memory!  To do this you use pack and
> unpack:
> 
>   my @qual = (40,34,35,99,99);
>   my $qual_str = pack('C*', @qual);
>   @qual = unpack('C*', $qual_str);
> 
>      James
> 
> 
>      So far I have only dealt with qualvals for reads, which obviously
>      don't extend that much. In general, incorporating this shouldn't be
>      any problem, but for writing/truncation/reversal the array will still
>      have to be unpacked.

Hilmar,

You don't have to unpack for writing/truncation/reversal:

Reversal:

	$qual_str = reverse($qual_str);

Truncation:

	$qual_str = substr($qual_str, 0, 3);

Writing:

	my @revised_qual = (41,45,45);
	substr($qual_str, 0, 3) = pack('C*', @revised_qual);
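
The three operations above can be put together in one short script. This is only a minimal sketch: the helper names pack_qual and unpack_qual are hypothetical (they are not BioPerl functions), and it assumes every quality value fits in one unsigned byte (0 to 255), which is what the 'C' template requires. It also shows reading a single value with ord() and substr(), so nothing has to be unpacked just to look at one position.

```perl
use strict;
use warnings;

# Hypothetical helper: pack quality values (each 0..255) one byte per value.
sub pack_qual { return pack('C*', @_) }

# Hypothetical helper: unpack the byte string back into a list of integers.
sub unpack_qual { return unpack('C*', $_[0]) }

my $qual_str = pack_qual(40, 34, 35, 99, 99);

# Reversal: 'scalar' forces string reversal even if this line is
# later moved into a list context.
my $rev = scalar reverse $qual_str;

# Truncation: keep only the first three values.
my $head = substr($qual_str, 0, 3);

# Writing: overwrite the first three values of a copy, using the
# 4-argument form of substr instead of the lvalue form.
my $edited = $qual_str;
substr($edited, 0, 3, pack_qual(41, 45, 45));

# Reading one value (index 2) without unpacking the whole string.
my $third = ord(substr($qual_str, 2, 1));

print join(' ', unpack_qual($rev)), "\n";    # prints "99 99 35 34 40"
```

The packed string costs one byte per base, and reverse() and substr() work on it directly, so unpacking is only needed when the individual numbers are wanted, for example when writing a qual file.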

>      James, if you're already handling quality values, does your code
>      already provide what seems to be needed? Wouldn't it make sense that
>      you check in your code?

Unfortunately this bit of code doesn't use BioPerl
at all, so I don't have something I can check in.

	James

James G.R. Gilbert
The Sanger Centre
Wellcome Trust Genome Campus
Hinxton
Cambridge                        Tel: 01223 494906
CB10 1SA                         Fax: 01223 494919