[Bioperl-l] Alphabet guessing
Dmitri Bichko
dbichko at aveopharma.com
Tue Oct 18 03:49:10 EDT 2005
Hi,
Is being unable to guess the sequence alphabet really an unrecoverable
error? I'm referring to this bit in PrimarySeq.pm:
    my $str = $self->seq();
    $str =~ s/[-.?x]//gi;
    my $total = CORE::length($str);
    if( $total == 0 ) {
        $self->throw("Got a sequence with no letters in it ".
                     "cannot guess alphabet [$str]");
    }
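To see why an all-X sequence trips this check, here is a minimal standalone sketch of the same stripping step. It uses no BioPerl at all; letters_left is a hypothetical helper name for illustration, not something in PrimarySeq.pm.

```perl
use strict;
use warnings;

# Hypothetical helper: count the characters that survive the same
# substitution used in the PrimarySeq.pm excerpt above.
sub letters_left {
    my ($str) = @_;
    $str =~ s/[-.?x]//gi;    # drop '-', '.', '?' and X/x (case-insensitive)
    return CORE::length($str);
}

# An all-X record and an all-gap record both end up empty;
# a record with real letters does not.
for my $seq ('XXXXXXXX', 'xx--..??', 'ACGTXXNN') {
    printf "%-10s -> %d\n", $seq, letters_left($seq);
}
```

Because X is stripped along with the gap characters, a fully masked record is indistinguishable from an empty one by the time the length is checked.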
The problem is that if you hit a sequence that is all X's, you get a fatal
exception. That is very annoying when you are in the middle of a 15
million sequence FASTA stream, where you neither care about nor expect a
particular alphabet type, and the docs suggest that you can't
necessarily recover after catching exceptions.
Might not something along these lines make more sense:
    if( $total == 0 ) {
        $self->warn("Got a sequence with no letters in it, ".
                    "assuming 'dna' alphabet.");
        $self->alphabet('dna');
        return 'dna';
    }
Or should the SeqIO factories catch the guessing exceptions?
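For the catching alternative, the control flow would look roughly like the sketch below. It is plain Perl with no BioPerl involved: guess_or_die is a stand-in that dies the way the excerpt throws, its dna/protein rule is a crude assumption made up for the example, and guess_all is a hypothetical driver. Whether a real SeqIO parser is left in a usable state after such a catch is exactly the open question, and this sketch does not settle it.

```perl
use strict;
use warnings;

# Stand-in for the guessing step. NOT BioPerl code; it only mimics
# the "no letters left" failure from the excerpt.
sub guess_or_die {
    my ($str) = @_;
    (my $letters = $str) =~ s/[-.?x]//gi;
    die "cannot guess alphabet [$str]\n" if CORE::length($letters) == 0;
    # Crude illustrative rule (assumption): only A/C/G/T/N means 'dna'.
    return $letters =~ /^[acgtn]+$/i ? 'dna' : 'protein';
}

# Wrap each record in eval so one unguessable record is skipped
# instead of ending the whole run.
sub guess_all {
    my @seqs = @_;
    my (@alphabets, @skipped);
    for my $seq (@seqs) {
        my $alphabet = eval { guess_or_die($seq) };
        if (defined $alphabet) { push @alphabets, $alphabet }
        else                   { push @skipped,   $seq }
    }
    return (\@alphabets, \@skipped);
}

my ($ok, $bad) = guess_all('ACGTACGT', 'XXXXXXXX', 'MKVLAAGI');
printf "guessed: %s; skipped: %s\n", join(',', @$ok), join(',', @$bad);
```

The eval sits around a single record, so a failure costs one record rather than the rest of the stream.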
Thanks,
Dmitri