[Bioperl-l] Building consensus sequence from ESTs

Andrew Macgregor andrew@anatomy.otago.ac.nz
Tue, 17 Sep 2002 16:57:31 +1200


Hi all,

I apologise if this question is a bit basic. Can anyone point me towards
tools within Bioperl that can take multiple ESTs and create a consensus
sequence, similar to what Gelmerge does in GCG?


For example, I want to take a bunch of ESTs that each look like this:

> 121549 gnl|UG|Mm#S121549 S100 calcium binding protein A10 (calpactin) /cds=(68,361) /gb=M16465 /gi=192360 /ug=Mm.1 /len=600
TCGTGGTGTGCCCAGCTCTTCCAAGGACTGCTGCGCTTCGGGGCCCAGGTTTCGACAGAC
TCTTCAAAATGCCATCCCAAATGGAGCACGCCATGGAAACCATGATGCTTACGTTTCACA
GGTTTGCAGGCGACAAAGACCACTTGACAAAGGAGGACCTGAGAGTGCTCATGGAACGGG
AGTTCCCTGGGTTTTTGGAAAATCAAAAGGATCCTCTGGCTGTGGACAAAATAATGAAGG
ACCTGGACCAGTGCCGAGATGGCAAAGTGGGCTTCCAGAGCTTTCTATCACTAGTGGCGG
GGCTCACCATTGCATGCAATGACTATTTTGTAGTAAACATGAAGCAGAAGGGGAAGAAAT
AGGCCAACTGGAGCACTGGTACCCCCACCCTGGTGCGTGTTCACCACGGGGTCACTTGAG
GAATCTGCCCCACTGCTTCTTGTGAGCAGATCAGGACCCTTAGGAAATGTGCAAATGAGA
TCCAACTCCAATTCAACAATCTGAGAGAGAAAACTTAATCCAATGGCAGAGAAGCTTCTG
AGTTTTATATTGTTTGCATCCCATTGCCCTCAATAAAGAAAGTCCTTTTTTTAAGTTCTG


And turn them into something like this (NB: the consensus would be built
from a number of ESTs, and this particular example is not derived from the
EST above):


AGGATGTTAGAAACCTGACATTAAAAATAGAGCAAGAAACTCAGAAGCGCTGCCTTACAC
AAAATGACCTGAAGATGCAAACACAACAGGTTAACACACTAAAAATGTCAGAAAAGCAGT
TAAAGCAAGAAAATAACCATCTCATGGAAATGAAAATGAACTTGGAAAAACAAAATGCTG
AACTTCGAAAAGAACGTCAGGATGCAGATGGGCAAATGAAAGAGCTCCAGGATCAGCTCG
AAGCAGAACAGTATTTCTCAACCCTTTATAAAACACAAGTTAGGGAGCTTAAAGAAGAAT
GTGAAGAAAAGACCAAACTTGGTAAAGAATTGCAGCAGAAGAAACAGGAATTACAGGATG
AACGGGACTCTTTGGCTGCCCAACTGGAGATCACCTTGACCAAAGCAGATTCTGAGCAAC
TGGCTCGTTCAATTGCTGAAGAACAATATTCTGATTTGGAAAAAGAGAAGATCATGAAAG
AGCTGGAGATCAAAGAGATGATGGCTAGACACAAACAGGAACTTACGGAAAAAGATGCTA
CAATTGCTTCTCTTGAGGAAACTAATAGGACACTAACTAGTGATGTTGCCAATCTTGCAA
ATGAGAAAGAAGAATTAAATAaCAAATTGAAAGATGTTCAAGAGCAACTGTCAAGATTGA
AAGATGAAGAAATAAGCGCAGCAGCTATTAAAGCACAGTTTGAGAAGCAGCTATTAACAG
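
To make the goal concrete, here is a minimal sketch in plain Python (not
Bioperl) of the final step only: a column-wise majority vote over reads that
have already been aligned and padded with '-'. The function name and the toy
reads are hypothetical, made up for illustration. Real EST assembly also has
to find the overlaps between reads, which this sketch does not attempt.

```python
from collections import Counter


def majority_consensus(aligned_reads, gap="-"):
    """Return the most common non-gap base in each column of pre-aligned reads."""
    if not aligned_reads:
        return ""
    width = len(aligned_reads[0])
    if any(len(read) != width for read in aligned_reads):
        raise ValueError("all aligned reads must have the same length")

    consensus = []
    for column in zip(*aligned_reads):
        # Ignore padding so that only reads covering this column get a vote.
        bases = [base.upper() for base in column if base != gap]
        if not bases:
            continue  # no read covers this column, so emit nothing for it
        # Take the most frequent base in the column.
        consensus.append(Counter(bases).most_common(1)[0][0])
    return "".join(consensus)


# Hypothetical, hand-aligned toy reads; the second one has a mismatch (A for G).
reads = [
    "TCGTGGTGTG------",
    "---TGGTATGCCCA--",
    "------TGTGCCCAGC",
]
print(majority_consensus(reads))  # prints TCGTGGTGTGCCCAGC
```

The mismatch in the second read is outvoted by the other two reads that cover
the same column, which is the behaviour I am hoping an existing tool provides.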


I've had a good hunt around the docs but nothing has jumped out at me. Any
help would be appreciated.

Thanks, Andrew.