[Bioperl-l] Questions on Representing Protein Ambiguity
Aaron J. Mackey
amackey at pcbi.upenn.edu
Fri Oct 1 16:04:29 EDT 2004
On Sep 30, 2004, at 10:49 PM, James Thompson wrote:
> An alternative would be to borrow an idea from Perl's regex character
> classes and represent multiple residues at a position inside of a set of
> brackets, like this:
>
> M[ES]N[IAP]S
In general, you're always going to lose information moving from a
profile to a flat pattern. The bracket notation keeps more than
flattening to "MENIS" would (although MENIS is a reasonable "consensus"
in this case), but it still drops the per-residue probabilities. So in
that sense it isn't fundamentally better than "just take the most
probable residue, unless its probability is below some threshold, in
which case use X".
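To make the comparison concrete, here is a minimal sketch of both flattenings. It assumes a profile stored as one residue-to-probability mapping per column; the probabilities, the names `bracket_pattern` and `threshold_consensus`, and the 0.5 default cutoff are all invented for illustration and are not part of BioPerl.

```python
# Hypothetical 5-column profile; the probabilities are made up so that the
# columns agree with the M[ES]N[IAP]S example from the quoted message.
PROFILE = [
    {"M": 1.0},
    {"E": 0.6, "S": 0.4},
    {"N": 1.0},
    {"I": 0.5, "A": 0.3, "P": 0.2},
    {"S": 1.0},
]


def bracket_pattern(profile, min_prob=0.0, unknown="X"):
    """Flatten a profile to a regex-style pattern such as M[ES]N[IAP]S."""
    parts = []
    for column in profile:
        # Keep residues above min_prob, most probable first.
        residues = sorted(
            (res for res, prob in column.items() if prob > min_prob),
            key=lambda res: -column[res],
        )
        if not residues:
            parts.append(unknown)
        elif len(residues) == 1:
            parts.append(residues[0])
        else:
            parts.append("[" + "".join(residues) + "]")
    return "".join(parts)


def threshold_consensus(profile, threshold=0.5, unknown="X"):
    """Take the most probable residue per column, or `unknown` below threshold."""
    out = []
    for column in profile:
        residue, prob = max(column.items(), key=lambda item: item[1])
        out.append(residue if prob >= threshold else unknown)
    return "".join(out)


print(bracket_pattern(PROFILE))           # M[ES]N[IAP]S
print(threshold_consensus(PROFILE))       # MENIS
print(threshold_consensus(PROFILE, 0.6))  # MENXS
```

Either output discards the numbers in `PROFILE`: the bracket form no longer says that E is more likely than S, and the threshold form hides the alternatives entirely.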
I think the whole idea of a consensus sequence from a profile is a bit
worthless, to be honest. What are you supposed to be able to do with
the consensus, search with it? That's what the profile is for in the
first place ... [ speaking of which, I'd love to see
Bio::Tools::dpAlign make use of these protein profiles ].
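As a toy illustration of why searching with the profile beats searching with its flattened form, the sketch below scores two sequences both ways. The profile values, the `profile_log_score` helper, and the `floor` pseudo-probability are assumptions made up for this example; this is not how Bio::Tools::dpAlign works, just a position-by-position log-probability sum with no gaps.

```python
import math
import re

# Same hypothetical profile as before; probabilities are invented.
PROFILE = [
    {"M": 1.0},
    {"E": 0.6, "S": 0.4},
    {"N": 1.0},
    {"I": 0.5, "A": 0.3, "P": 0.2},
    {"S": 1.0},
]

# The flattened pattern from the quoted message.
PATTERN = re.compile("M[ES]N[IAP]S")


def profile_log_score(profile, seq, floor=1e-6):
    """Sum of log-probabilities of seq under the profile (ungapped).

    Residues absent from a column get the small `floor` probability
    instead of zero, so the logarithm stays defined.
    """
    if len(seq) != len(profile):
        raise ValueError("sequence and profile must have the same length")
    return sum(math.log(col.get(res, floor)) for col, res in zip(profile, seq))


for seq in ("MENIS", "MSNPS"):
    matched = PATTERN.fullmatch(seq) is not None
    print(seq, matched, round(profile_log_score(PROFILE, seq), 3))
```

Both sequences match the flat pattern equally, while the profile score ranks MENIS above MSNPS, which is exactly the information the pattern has thrown away.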
-Aaron
--
Aaron J. Mackey, Ph.D.
Dept. of Biology, Goddard 212
University of Pennsylvania       email:  amackey at pcbi.upenn.edu
415 S. University Avenue         office: 215-898-1205
Philadelphia, PA 19104-6017      fax:    215-746-6697