[Bioperl-l] Fishing redundant sequences in FASTA files [Right formatting]
    Jason Stajich 
    jason at bioperl.org
       
    Tue Feb 15 22:21:34 UTC 2011
    
    
  
Also see cd-hit, which lets you tune the percent-identity threshold used for matching.
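
For the exact-match case raised in the quoted message below, a checksum pass might look roughly like this. It is only a sketch under some assumptions: the function names `read_fasta` and `exact_duplicates` are made up for illustration (they are not BioPerl or cd-hit APIs), sequences are compared case-insensitively, and MD5 is used purely as a fingerprint. It reports only full-length identical sequences, so a sequence contained within a longer one is not flagged; that case needs clustering.

```python
import hashlib
from collections import defaultdict


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def exact_duplicates(records):
    """Return lists of headers whose sequences are identical over their full length."""
    groups = defaultdict(list)
    for header, seq in records:
        # Assumption: case differences are not meaningful, so normalise first.
        digest = hashlib.md5(seq.upper().encode("ascii")).hexdigest()
        groups[digest].append(header)
    return [headers for headers in groups.values() if len(headers) > 1]


if __name__ == "__main__":
    demo = [">a", "ACGTACGT", ">b", "acgt", "ACGT", ">c", "ACGTACG"]
    # "c" is a prefix of "a" but is not reported: checksums only see exact matches.
    print(exact_duplicates(read_fasta(demo)))  # [['a', 'b']]
```

To run it on a real file, pass an open file handle to `read_fasta` in place of the demo list.
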
Dave Messina wrote:
>> But one nice thing is clustering allows for partial matches (which I think
>> is the original criterion).  I don't believe SHA/MD5 would work for that
>> purpose.
>
>
> Yep, for sure. Checksums will find full-length exact matches only.
>
>
> Dave
-- 
Jason Stajich
jason at bioperl.org
http://bioperl.org/wiki