[Bioperl-l] Questions on Representing Protein Ambiguity
James Thompson
tex at biosysadmin.com
Sun Oct 3 06:15:04 EDT 2004
Aaron,
Thanks for the feedback. You're definitely right about consensus sequences
being relatively worthless when compared to the information contained in the
whole profile.
Friday afternoon I committed some code to ProtMatrix.pm that will allow the
regexp method to take a threshold as an argument, and it's not too hard to change.
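To make the threshold idea concrete, here is a rough sketch in Python. It is not the ProtMatrix.pm code: the profile layout (one residue-to-frequency dict per position), the function names, the default thresholds, and the example frequencies are all assumptions made up for illustration. It shows both the bracketed pattern and the "most probable residue, else X" consensus discussed in this thread.

```python
import re


def profile_to_regexp(profile, threshold=0.2):
    """Build a bracketed pattern from a profile (hypothetical helper).

    profile: list of dicts, one per position, mapping residue -> frequency.
    Residues at or above the threshold are kept, most frequent first.
    A position where nothing reaches the threshold becomes '.'.
    """
    parts = []
    for column in profile:
        kept = sorted(
            (res for res, freq in column.items() if freq >= threshold),
            key=lambda res: (-column[res], res),
        )
        if not kept:
            parts.append(".")
        elif len(kept) == 1:
            parts.append(kept[0])
        else:
            parts.append("[" + "".join(kept) + "]")
    return "".join(parts)


def profile_to_consensus(profile, threshold=0.5):
    """Most probable residue per position, or 'X' if it is below threshold."""
    out = []
    for column in profile:
        res, freq = max(column.items(), key=lambda item: (item[1], item[0]))
        out.append(res if freq >= threshold else "X")
    return "".join(out)


# Illustrative frequencies only, chosen to reproduce the pattern in this thread.
profile = [
    {"M": 1.0},
    {"E": 0.6, "S": 0.4},
    {"N": 1.0},
    {"I": 0.4, "A": 0.3, "P": 0.3},
    {"S": 1.0},
]

pattern = profile_to_regexp(profile, threshold=0.2)
print(pattern)                                # M[ES]N[IAP]S
print(profile_to_consensus(profile, 0.5))     # MENXS
print(bool(re.fullmatch(pattern, "MSNAS")))   # True
```

Raising the threshold drops the minor residues, so the pattern moves toward a flat consensus and discards more of the profile, which is the information loss Aaron describes below.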
The Bio::Tools::dpAlign idea looks interesting; I'd never seen it before
myself. Sometime down the road I'll look into making it use matrices from the
Bio::Matrix::PSM family. Right now I'll work on making sure all of my code is
release-worthy. :)
James Thompson
On Fri, 1 Oct 2004, Aaron J. Mackey wrote:
>
> On Sep 30, 2004, at 10:49 PM, James Thompson wrote:
>
> > An alternative would be to borrow an idea from Perl's regex character
> > classes and represent multiple residues at a position inside of a set
> > of brackets, like this:
> >
> > M[ES]N[IAP]S
>
> In general, you're always going to lose information moving from a
> profile to a flat pattern. This option prevents losing all the
> information that flattening to "MENIS" would (although MENIS is a
> reasonable "consensus" in this case), but there's still information
> loss. So in that sense it isn't really a better solution than "just
> take the most probable residue, unless it's less than some threshold,
> in which case X".
>
> I think the whole idea of a consensus sequence from a profile is a bit
> worthless, to be honest. What are you supposed to be able to do with
> the consensus, search with it? That's what the profile is for in the
> first place ... [ speaking of which, I'd love to see
> Bio::Tools::dpAlign make use of these protein profiles ].
>
> -Aaron
>
> --
> Aaron J. Mackey, Ph.D.
> Dept. of Biology, Goddard 212
> University of Pennsylvania email: amackey at pcbi.upenn.edu
> 415 S. University Avenue office: 215-898-1205
> Philadelphia, PA 19104-6017 fax: 215-746-6697
>
>