[Bioperl-l] .qual are like fasta files

Chad Matsalla chad@sausage.usask.ca
Tue, 7 Aug 2001 00:30:55 -0600 (CST)


Hi,

I had a bit of time over the weekend and I wanted to see if I could create
ways to deal with streams of quality files: both phred files and files
that use a fasta-like format.

I found that this could be done in a way that closely follows the current
SeqIO-style methods:

1. I created a package Bio::PrimaryQual, which was heavily based on
PrimarySeq but was designed to hold arrays of quality values.
PrimaryQual @ISA = qw(Bio::Root::RootI Bio::PrimarySeqI)

2. I created a package Bio::SeqIO::qual, which contains the
methods to parse the quality values from the fasta-style .qual files. It
was based on Bio::SeqIO::fasta and Bio::SeqIO::qual @ISA = qw(Bio::SeqIO);


3. They can be used (successfully) as follows:

my $qualobj = Bio::PrimaryQual->new( -seq => '10 20 30 40 50 40 30 20 10',
                                     -id  => 'QualityFragment-12',
                                     -accession_number => 'X78121',
                                     -moltype => 'raw'
                                     );

or like this:

my $in_qual = Bio::SeqIO->new( -file   => "<t/qualfile.qual",
                               -format => 'qual' );

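use Data::Dumper;   # Dumper() returns a string, so it can be concatenated

print("I saw these in qualfile.qual:\n");
while ( my $qual = $in_qual->next_seq() ) {
        # seq() returns a reference to an array of quality values
        print($qual->display_id()."\n".Dumper($qual->seq()));
}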

This seemed to work fine, but {seq} is now a reference to an array of
quality values, which is _much_ easier for me to work with.
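
To make the idea concrete, here is a minimal sketch of turning a
fasta-style .qual stream into arrays of quality values. It uses core Perl
only and is not the actual Bio::SeqIO::qual code; parse_qual() and the
record layout (a hash with "id" and "qual" keys) are hypothetical names
chosen for illustration, and it assumes quality values are
whitespace-separated non-negative integers.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): read fasta-style .qual records
# from a filehandle and return a list of { id => ..., qual => [ ... ] } hashes.
sub parse_qual {
    my ($fh) = @_;
    my (@records, $current);
    while ( my $line = <$fh> ) {
        chomp $line;
        if ( $line =~ /^>(\S+)/ ) {
            # A header line starts a new record; the id is the first token.
            $current = { id => $1, qual => [] };
            push @records, $current;
        }
        elsif ( $current ) {
            # Quality lines hold whitespace-separated integers; a record
            # may span several lines, so keep appending to the same array.
            push @{ $current->{qual} }, grep { /^\d+$/ } split ' ', $line;
        }
    }
    return @records;
}

# Usage: parse an in-memory example instead of a real file.
my $text = ">QualityFragment-12\n10 20 30 40 50\n40 30 20 10\n";
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
for my $rec ( parse_qual($fh) ) {
    print "$rec->{id}: @{ $rec->{qual} }\n";
}
close $fh;
```

Keeping the values in an array reference means each one can be indexed
or sliced directly, without splitting a string every time.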

Could I use the above PrimaryQual.pm object as an integral part of the
Phred.pm module, and convert the Phred.pm module to use SeqIO
(-format => 'phred') like this:

my $in_qual = Bio::SeqIO->new( -file   => "<t/qualfile.qual",
                               -format => 'phred' );

to parse them? -format => 'phred' doesn't exist yet, but I would write it.
Maybe this would make the uniformity of the interfaces a lot clearer.

I'm not sure if I am expressing this correctly but I am hoping somebody
can understand what I am trying to get at:

Should I submit the modules I have that are basically standalone,
file-at-a-time modules or should I try to use SeqIO and family as
described in this message?


Who do I ask for a CVS account? Review of the code might be the best way.

Thanks for the advice,

Chad Matsalla