Hi,

I have been trying to build a divergent sequence dataset for a phylogenetic analysis. Is there anything in Biopython that, for a given set of sequences, lets us choose an identity threshold to reduce redundancy in the dataset?

Cheers,
Animesh
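
P.S. To make the question concrete, here is a rough sketch of the kind of filtering I mean. It is plain Python, not a Biopython API; the function names `pairwise_identity` and `filter_redundant`, the toy sequences, and the 0.8 threshold are all made up for illustration. It also assumes the sequences are already aligned to the same length, and it uses a simple greedy pass, so the result depends on input order.

```python
def pairwise_identity(a, b):
    """Fraction of identical positions between two aligned sequences.

    Assumes both sequences come from the same alignment (equal length).
    Columns where both sequences have a gap are ignored.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if not (x == "-" and y == "-")]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y and x != "-")
    return matches / len(pairs)


def filter_redundant(records, threshold=0.8):
    """Greedy redundancy filter (hypothetical helper).

    Keeps a sequence only if its identity to every sequence kept so far
    is below `threshold`.
    """
    kept = []
    for name, seq in records:
        if all(pairwise_identity(seq, kept_seq) < threshold
               for _, kept_seq in kept):
            kept.append((name, seq))
    return kept


# Toy aligned sequences, invented for this example.
records = [
    ("seq1", "ACGTACGTAC"),
    ("seq2", "ACGTACGTAT"),  # differs from seq1 at one position out of ten
    ("seq3", "TTGTACCTAA"),  # differs from seq1 at four positions out of ten
]

representatives = filter_redundant(records, threshold=0.8)
print([name for name, _ in representatives])  # ['seq1', 'seq3']
```

Something along these lines that works directly on Biopython sequence or alignment objects is what I am after.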