[Bioperl-l] testing for sequence
Heikki Lehvaslaiho
heikki@ebi.ac.uk
Thu, 14 Mar 2002 15:19:41 +0000
Guoneng Zhong wrote:
>
> Hi,
> Is there a way for me to know if a given string looks like a protein or
> dna/rna sequence? Other than doing a grep on all the DNA and Protein
> symbols?
There is an internal method, _guess_alphabet @ Bio::PrimarySeqI, which is
called when you set the sequence string through the seq() method. It sets
the alphabet to dna, rna, or protein depending on the ratio
([atgc]u? count / seq_length). If the ratio is above 85%, the sequence is
not treated as a protein. This is a heuristic, but it works in most cases.
You can always confuse this, and almost any other algorithm, with heavy use
of ambiguous nucleotide characters.
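
The ratio test described above can be sketched as a standalone function.
This is only an illustration in Python, not BioPerl's actual code: the
function name guess_alphabet, the threshold parameter, the decision to
ignore non-letter characters, and the rule that any "u" means rna are all
assumptions made for the example.

```python
def guess_alphabet(seq, threshold=0.85):
    """Guess 'dna', 'rna', or 'protein' from the share of a/c/g/t/u letters."""
    # Assumption: only letters count, so gaps, digits, and whitespace
    # do not skew the ratio.
    letters = [c for c in seq.lower() if c.isalpha()]
    if not letters:
        raise ValueError("sequence contains no letters")

    # Fraction of characters that are unambiguous nucleotide symbols.
    nucleotide_count = sum(c in "acgtu" for c in letters)
    ratio = nucleotide_count / len(letters)

    # At or below the threshold, the string is taken to be a protein.
    if ratio <= threshold:
        return "protein"

    # Assumption: the presence of any 'u' marks the sequence as RNA.
    return "rna" if "u" in letters else "dna"
```

With this sketch, a string such as "NNNNRYATGC" is reported as protein,
because only 4 of its 10 letters are a/c/g/t/u. That is the kind of
confusion ambiguous nucleotide characters can cause.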
-Heikki
> G
--
______ _/ _/_____________________________________________________
_/ _/ http://www.ebi.ac.uk/mutations/
_/ _/ _/ Heikki Lehvaslaiho heikki@ebi.ac.uk
_/_/_/_/_/ EMBL Outstation, European Bioinformatics Institute
_/ _/ _/ Wellcome Trust Genome Campus, Hinxton
_/ _/ _/ Cambs. CB10 1SD, United Kingdom
_/ Phone: +44 (0)1223 494 644 FAX: +44 (0)1223 494 468
___ _/_/_/_/_/________________________________________________________