[Bioperl-l] Fasta Qual files
James Gilbert
jgrg@sanger.ac.uk
Thu, 14 Sep 2000 17:18:23 +0100 (BST)
On Thu, 14 Sep 2000 hilmar.lapp@pharma.Novartis.com wrote:
> > > I have code for reading Phred-produced seq/qual pairs of FASTA-formatted
> > > files into Bio::QualSeq objects, which are merely Bio::Seq objects with
> > > quality values (being truncated and reversed, too, when you
> > > truncate/reverse the seq). The code can also write qual-files.
> > >
> > > If you think this is useful for you I'll try to put it to the repository.
> >
> > Sounds very useful. Check it in!
>
> Hi,
>
> Sorry if I'm insulting anyone by stating the
> obvious.
>
> I have to deal with quality values for the Sanger
> submissions. It is a good trick to store the
> quality array in memory as a string of unsigned
> chars. Perl arrays 100k long start consuming a
> lot of memory! To do this you use pack and
> unpack:
>
> my @qual = (40,34,35,99,99);
> my $qual_str = pack('C*', @qual);
> @qual = unpack('C*', $qual_str);
>
> James
>
>
> So far I have only dealt with qualvals for reads, which obviously
> don't extend that much. In general, incorporating this shouldn't be
> any problem, but for writing/truncation/reversal the array will still
> have to be unpacked.
Hilmar,

You don't have to unpack for writing/truncation/reversal:

Reversal:

    $qual_str = reverse($qual_str);

Truncation:

    $qual_str = substr($qual_str, 0, 3);

Writing:

    @revised_qual = (41,45,45);
    substr($qual_str, 0, 3) = pack('C*', @revised_qual);

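
For anyone who wants to try the packed-string operations end to end, here is a minimal sketch. It assumes every quality value fits in one unsigned byte (0..255), which is what pack('C*') requires. The variable names ($rev, $trunc, $edited, $third) are only illustrative and are not taken from any existing module.

```perl
use strict;
use warnings;

# Example quality values; each must be in 0..255 to fit one unsigned char.
my @qual     = (40, 34, 35, 99, 99);
my $qual_str = pack('C*', @qual);    # one byte per quality value

# Reversal: scalar() makes sure reverse() acts on the string,
# even if this expression is later moved into list context.
my $rev = scalar reverse($qual_str);

# Truncation: keep the first three quality values.
my $trunc = substr($qual_str, 0, 3);

# Overwriting: replace the first three values in a copy,
# using the four-argument form of substr().
my $edited = $qual_str;
substr($edited, 0, 3, pack('C*', 41, 45, 45));

# Reading a single value without unpacking the whole string.
my $third = ord(substr($qual_str, 2, 1));

# Numbers are only needed at output time, so unpack just then.
print join(' ', unpack('C*', $rev)),    "\n";
print join(' ', unpack('C*', $trunc)),  "\n";
print join(' ', unpack('C*', $edited)), "\n";
print "$third\n";
```

The packed string stays one byte per base throughout, and unpack is only called on whatever is about to be printed.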
> James, if you're already handling quality values, does your code
> already provide what seems to be needed? Wouldn't it make sense that
> you check in your code?
Unfortunately this bit of code doesn't use BioPerl
at all, so I don't have anything I can check in.
James
James G.R. Gilbert
The Sanger Centre
Wellcome Trust Genome Campus
Hinxton
Cambridge Tel: 01223 494906
CB10 1SA Fax: 01223 494919