[Bioperl-l] Error when calling remove_gaps
Jason Stajich
jason.stajich at duke.edu
Tue Feb 15 18:25:30 EST 2005
I really hate the way _guess_alphabet works. It falls over completely on
sequences that are all X, or empty, and it should not throw when it hits
a case like this.
I have overridden it, along with validate_seq, in many of my scripts.
Try putting this at the top of your script, after the 'use' statements
but before your own code.
sub Bio::PrimarySeq::_guess_alphabet {
    my ($self) = @_;
    my $type;
    my $str = $self->seq();
    # Remove characters that clearly denote gaps or ambiguity
    $str =~ s/[-.?x]//gi;
    my $total = CORE::length($str);
    if( $total == 0 ) {
        # Warn instead of throwing, so empty or all-X sequences survive
        $self->warn("Got a sequence with no letters in it ".
                    "cannot guess alphabet [$str]");
        return 'dna'; # just make dna the default for now
    }
    # tr/// with an empty replacement list only counts matches
    my $u = ($str =~ tr/Uu//);
    # The assumption here is that most sequences comprised mainly of
    # ATGC, with some N, will be 'dna' despite the fact that N could
    # also be Asparagine
    my $atgc = ($str =~ tr/ATGCNatgcn//);
    if( ($atgc / $total) > 0.85 ) {
        $type = 'dna';
    } elsif( (($atgc + $u) / $total) > 0.85 ) {
        $type = 'rna';
    } else {
        $type = 'protein';
    }
    $self->alphabet($type);
    return $type;
}
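
If you want to see what the heuristic decides without loading BioPerl at
all, here is a minimal standalone sketch of the same logic. The name
guess_alphabet_from_string is hypothetical (it is not a BioPerl function),
and it takes a plain string instead of a Bio::PrimarySeq object; treat it
as an illustration of the thresholds above, not as the library's behaviour.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical standalone helper, not part of BioPerl: the same heuristic
# as the override above, applied to a plain string.
sub guess_alphabet_from_string {
    my ($str) = @_;
    $str = '' unless defined $str;

    # Drop gap and ambiguity characters before counting.
    $str =~ s/[-.?x]//gi;
    my $total = length $str;

    # Nothing left to count: fall back to 'dna' rather than dying.
    return 'dna' if $total == 0;

    # tr/// with an empty replacement list counts without modifying.
    my $u    = ($str =~ tr/Uu//);
    my $atgc = ($str =~ tr/ATGCNatgcn//);

    return 'dna' if $atgc / $total > 0.85;
    return 'rna' if ($atgc + $u) / $total > 0.85;
    return 'protein';
}

# A few sample inputs, including the gap-only and all-X cases.
for my $seq ('ACGTNNACGT', 'ACGUACGU', 'MKVLAAGIVG', '----', 'XXXX') {
    printf "%-14s => %s\n", "'$seq'", guess_alphabet_from_string($seq);
}
```

The gap-only and all-X inputs both end up empty after the substitution, so
they take the 'dna' fallback, which is the case that made remove_gaps throw.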
--
Jason Stajich
jason.stajich at duke.edu
http://www.duke.edu/~jes12/
On Feb 16, 2005, at 3:32 AM, michael watson ((IAH-C)) wrote:
> Hi
>
> I'm using bioperl-1.4 on Linux. I get the following error after calling
> remove_gaps on an alignment I have read in using AlignIO. The alignment
> is in fasta format, and some sequences contain "N"'s and "-"'s as gap
> characters, but some sequences do not include any, including the first
> sequence. This problem occurs when I call:
>
> $al->remove_gaps("-")
>
> ------------- EXCEPTION -------------
> MSG: Got a sequence with no letters in - cannot guess alphabet []
> STACK Bio::PrimarySeq::_guess_alphabet /usr/local/bioperl-1.4/Bio/PrimarySeq.pm:839
> STACK Bio::PrimarySeq::seq /usr/local/bioperl-1.4/Bio/PrimarySeq.pm:280
> STACK Bio::SimpleAlign::_remove_col /usr/local/bioperl-1.4/Bio/SimpleAlign.pm:959
> STACK Bio::SimpleAlign::remove_gaps /usr/local/bioperl-1.4/Bio/SimpleAlign.pm:922
> STACK toplevel create_blastable.pl:14
>
> --------------------------------------
>
> Any ideas?
>
> Thanks in advance
>
> Mick