[Bioperl-l] Fishing redundant sequences in FASTA files [Right formatting]
Jason Stajich
jason at bioperl.org
Tue Feb 15 22:21:34 UTC 2011
Also see CD-HIT, which lets you tune the percent-identity threshold used for matching.
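As a rough illustration of the tuning mentioned above, here is a minimal Python sketch that only assembles a cd-hit command line. The helper name `build_cd_hit_command` and the file names are made up for this example. It assumes the protein `cd-hit` program, whose `-c` option sets the sequence identity threshold and whose `-n` option sets the word length; the sketch keeps the default word length of 5, so it deliberately rejects thresholds below 0.7. Check the CD-HIT user guide before relying on any of this.

```python
def build_cd_hit_command(fasta_in, fasta_out, identity=0.9):
    """Return an argv list for a cd-hit run at the given identity threshold.

    Hypothetical helper for illustration only; it does not run cd-hit.
    """
    # Assumption: word length 5 (the cd-hit default) is only appropriate
    # for thresholds between 0.7 and 1.0, so anything else is rejected.
    if not 0.7 <= identity <= 1.0:
        raise ValueError("this sketch only covers thresholds from 0.7 to 1.0")
    return [
        "cd-hit",
        "-i", fasta_in,            # input FASTA file
        "-o", fasta_out,           # output file of representative sequences
        "-c", f"{identity:.2f}",   # sequence identity threshold
        "-n", "5",                 # word length
    ]


if __name__ == "__main__":
    # Placeholder file names; prints the command instead of executing it.
    print(" ".join(build_cd_hit_command("seqs.fasta", "seqs_nr.fasta", 0.95)))
```

The resulting list could be passed to `subprocess.run` on a machine where cd-hit is installed.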
Dave Messina wrote:
>> But one nice thing is clustering allows for partial matches (which I think
>> is the original criterion). I don't believe SHA/MD5 would work for that
>> purpose.
>
>
> Yep, for sure. Checksums will find full-length exact matches only.
>
>
> Dave
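To make the checksum point in the quoted text concrete, here is a minimal Python sketch (not BioPerl, and not anyone's method from this thread) that groups FASTA records by a digest of the full sequence. The function names and the demo records are invented for illustration, SHA-256 stands in for whichever SHA/MD5 checksum you prefer, and upper-casing the sequence before hashing is an assumption about what should count as "identical".

```python
import hashlib
from collections import defaultdict


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def exact_duplicates(records):
    """Group headers whose sequences are identical over their full length."""
    groups = defaultdict(list)
    for header, seq in records:
        # Assumption: case differences are ignored when comparing sequences.
        digest = hashlib.sha256(seq.upper().encode("ascii")).hexdigest()
        groups[digest].append(header)
    return [ids for ids in groups.values() if len(ids) > 1]


if __name__ == "__main__":
    # Invented demo records: "b" repeats "a"; "c" is a truncated copy of "a".
    demo = [">a", "MKTAYIAK", ">b", "mktayiak", ">c", "MKTAYIA"]
    print(exact_duplicates(read_fasta(demo)))
```

On the demo records this reports only `a` and `b` as duplicates. Record `c` is a prefix of `a` but gets a different digest, which is the partial-match case that needs clustering instead.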
--
Jason Stajich
jason at bioperl.org
http://bioperl.org/wiki