[Bioperl-l] Protein families
brunovecchi at yahoo.com.ar
Wed Feb 11 21:02:09 UTC 2009
This question is somewhat unrelated to Bioperl technical issues, but I
hope I can get some answers.
What would be a sane way to determine whether a sequence is part of a
family? Since that's too broad an issue, I'll restrict it:
- It doesn't have to use online services.
- It has to be scriptable.
- It has to rely only on the amino acid sequence (i.e., no
experimental evidence, including 3D structure).
- If possible, it should be fast.
- For extra points, it should be simple (or complicated, but have a
The context is this: I want to perform some GA randomization on a
protein sequence to optimize for an arbitrary target function (for
instance, increase the occurrence of certain types of proteolytic enzymes), but I
also want to minimize the chance of losing the protein's original
function. So I thought that I'd need some sort of quantitative measure
of how close the sequence is to belonging to the original's family.
The simplest way that I can think of for doing this is to first
build a profile for the family, based on a multiple sequence
alignment; then to align each random sequence against the profile and
calculate an e-value. But since I don't know much about these things, I
really can't judge whether it makes sense or is completely wrong.
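As a rough illustration of the profile idea described above, here is a minimal Python sketch. It makes several assumptions not stated in the post: the alignment is gap-free, the candidate has the same length as the alignment (as a point-mutated GA variant would), the background distribution is uniform, and a pseudocount of 1 is used. The names build_profile and score are hypothetical. It returns a log-odds score rather than an e-value, which would additionally need a null distribution of scores.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_profile(alignment, pseudocount=1.0):
    """Build per-column log-odds scores from equal-length aligned sequences."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    background = 1.0 / len(AMINO_ACIDS)  # assumption: uniform background
    profile = []
    for col in range(length):
        # Count only the 20 standard residues in this column.
        counts = Counter(seq[col] for seq in alignment if seq[col] in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        profile.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return profile


def score(profile, candidate):
    """Sum the log-odds of each residue of the candidate, column by column."""
    if len(candidate) != len(profile):
        raise ValueError("candidate must have the same length as the profile")
    return sum(column[aa] for column, aa in zip(profile, candidate))
```

A higher score means the candidate looks more like the family's columns than like the background, so it could serve as a penalty term in the GA's fitness function.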
Using Bio::Tools::HMM sounded fine, but unfortunately it doesn't offer
a method for calculating the probability of an observation sequence,
given the profile.
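For reference, the probability of an observation sequence given an HMM is what the forward algorithm computes. The sketch below is for a plain discrete HMM with dictionaries for the parameters. It is not the Bio::Tools::HMM API and not a full profile HMM with match, insert and delete states. The function name and argument layout are hypothetical.

```python
def forward_probability(obs, states, start_p, trans_p, emit_p):
    """Return P(obs | model) for a discrete HMM using the forward algorithm."""
    if len(obs) == 0:
        raise ValueError("observation sequence must not be empty")
    # alpha[s] = probability of the symbols seen so far, ending in state s
    alpha = {s: start_p[s] * emit_p[s][obs[0]] for s in states}
    for symbol in obs[1:]:
        alpha = {
            s: emit_p[s][symbol] * sum(alpha[r] * trans_p[r][s] for r in states)
            for s in states
        }
    return sum(alpha.values())
```

For long sequences the raw probabilities underflow, so a practical version would work in log space or rescale alpha at each step.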
What would you suggest? Thanks in advance!
PS: If there is a more appropriate mailing list for this sort of
question, please don't hesitate to educate me.