Bioperl: Re: Bio::Tools::Blast
Gordon D. Pusch
pusch@mcs.anl.gov
Thu, 27 Aug 1998 13:17:43 -0500
> -re- non-redundant database-builder, Gordon D. Pusch wrote,
> > Can anyone suggest a more elegant algorithm than the
> > ``stupid-but-simple'' method outlined above ???
>
> As a last resort, I would look into suffix trees, which are very
> nice for such tasks, and have been used in connection w/ the yeast
> database at the MIPS in Munich.
Ummm... Actually, that was my =first= resort (sort of)... :-/
I've built a Berkeley DB of what we call ``tail tags,'' which is a hash
of lists of IDs keyed by the last 20 amino acids of each sequence; we use
these for a number of different ``quick lookup'' purposes.
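To make the ``tail tag'' idea concrete, here is a minimal sketch in Python. It is not the code described in this post: an in-memory dict stands in for the Berkeley DB, and the function name, the `seqs` mapping of ID to sequence, and the handling of sequences shorter than 20 residues (the whole sequence becomes the tag) are all assumptions made for illustration.

```python
from collections import defaultdict

TAIL_LEN = 20  # the post keys on the last 20 amino acids


def build_tail_tags(seqs, tail_len=TAIL_LEN):
    """Map each tail tag to the list of sequence IDs that end with it.

    `seqs` is a hypothetical {sequence_id: sequence_string} mapping.
    A sequence shorter than `tail_len` is keyed by its full length,
    which is an assumption; the post does not say how these are handled.
    """
    index = defaultdict(list)
    for seq_id, seq in seqs.items():
        index[seq[-tail_len:]].append(seq_id)
    return dict(index)
```

Each bucket then holds only the IDs that share a tail, so any later comparison of full sequences can be restricted to one bucket at a time.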
However, one needs to do a substantial amount of processing to boil the
lists of ``same tail-tag IDs'' down to a non-redundant set of sequences,
and there are some peculiarities in the output of my code that cause me
to suspect bugs in my reduction algorithm; hence, my desire to find
something simpler and more elegant...
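One possible shape for the reduction step is sketched below in Python, under a loudly stated assumption: ``redundant'' is taken to mean that two entries have exactly identical sequences. The post does not define the criterion, so this may be narrower than what is actually wanted. The function names and the `tail_tags` and `seqs` mappings are hypothetical.

```python
def reduce_bucket(ids, seqs):
    """Collapse the IDs in one tail-tag bucket whose sequences are identical.

    Returns {representative_id: [all IDs with that sequence]}.
    The representative is the alphabetically first ID (an arbitrary choice).
    """
    by_seq = {}
    for seq_id in ids:
        by_seq.setdefault(seqs[seq_id], []).append(seq_id)
    return {min(group): sorted(group) for group in by_seq.values()}


def non_redundant(tail_tags, seqs):
    """Reduce every bucket of a {tail_tag: [ids]} index to unique sequences.

    Assumes exact-match redundancy only; sequences with different tails
    cannot be identical, so buckets can be processed independently.
    """
    reps = {}
    for ids in tail_tags.values():
        reps.update(reduce_bucket(ids, seqs))
    return reps
```

Because each bucket is handled on its own, the work per bucket is proportional to the number of IDs sharing that tail rather than to the size of the whole database. A looser criterion (for example, treating a sequence that is a proper suffix of another as redundant) would need a different comparison inside `reduce_bucket`.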
-- Gordon D. Pusch <pusch@mcs.anl.gov>
Disclaimer: I'm a consultant collaborating with Argonne researchers;
I don't speak for ANL or the DOE --- and they *certainly* don't speak
for =ME= !!!
Claimer: I report =ALL= SPAMvertisers to their ISP --- =NO= exceptions !!!