[BioPython] How to use the Bio.Cluster module to assemble DNA sequences
Bruno Santos
bsantos at biocant.pt
Thu Mar 27 17:33:55 UTC 2008
Hi,
This question is a bit more general, so I am not sure whether anyone on the
mailing list can help me.
I have a FASTA file with thousands of reads obtained from a sequencing run. I
know this file contains several copies of the same sequences, but their
lengths and some of their nucleotides can differ. I therefore need to group
them by clustering so that I can create a consensus sequence for each
group.
I am trying to achieve this by aligning all the sequences with clustalw-mpi
and then running dnadist from PHYLIP to obtain a matrix of distances between
the sequences. Now I need to cluster the sequences based on these distances,
and I am trying to use Bio.Cluster for that.
Can anyone help me choose which clustering method to use, and explain how I
can pass this kind of data to that method?
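
To make the grouping step concrete, here is a minimal sketch in plain Python.
It does not use Bio.Cluster; it only illustrates the idea of turning a distance
matrix into groups of reads. It assumes the dnadist output is a square PHYLIP
matrix whose sequence names contain no spaces. The function names, the cutoff
value, and the four-read matrix are all made up for illustration, and
single-linkage grouping at a fixed cutoff is just one possible choice of
method.

```python
def parse_phylip_square(text):
    """Parse a square PHYLIP distance matrix.

    Assumption: sequence names contain no whitespace. Parsing is token-based,
    so rows that wrap over several lines are handled as well.
    """
    tokens = text.split()
    n = int(tokens[0])              # first token is the number of sequences
    names, matrix = [], []
    pos = 1
    for _ in range(n):
        names.append(tokens[pos])   # sequence name
        matrix.append([float(t) for t in tokens[pos + 1:pos + 1 + n]])
        pos += 1 + n
    return names, matrix


def single_linkage_groups(names, matrix, cutoff):
    """Group sequences connected by a chain of pairwise distances <= cutoff."""
    parent = list(range(len(names)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(names)):
        for j in range(i):
            if matrix[i][j] <= cutoff:
                root_i, root_j = find(i), find(j)
                if root_i != root_j:
                    parent[root_i] = root_j   # merge the two groups

    groups = {}
    for i, name in enumerate(names):
        groups.setdefault(find(i), []).append(name)
    return sorted(groups.values())


# Invented toy input in the layout of a square PHYLIP matrix.
example = """4
read1  0.0000 0.0100 0.9000 0.8500
read2  0.0100 0.0000 0.8800 0.9100
read3  0.9000 0.8800 0.0000 0.0200
read4  0.8500 0.9100 0.0200 0.0000
"""

names, matrix = parse_phylip_square(example)
groups = single_linkage_groups(names, matrix, cutoff=0.05)
print(groups)
```

Each resulting group would then be the input for one consensus sequence. The
cutoff controls how different two reads may be while still ending up in the
same group, so it would have to be tuned to the expected sequencing error.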
Sincerely,
Bruno Santos