[Bioperl-l] Count amino acid frequency

Stephan Bour sbour at niaid.nih.gov
Fri Jun 27 12:35:27 EDT 2003


I'm new to the list and bioperl so I hope this is not too stupid a question.

I need to write a perl script that does the following:
- Take a file with about 1000 sequences of the same protein in FASTA format
- For each position across all sequences, count the number of occurrences of
each possible residue
- Return only the count of the residues actually present at each position
(in other words, residues present 0 times are not returned).
- Present the data in tab-delimited format that can be imported into Excel
for graphing
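
To make the steps above concrete, here is a minimal sketch of the logic. It is
written in plain Python rather than Perl and uses no BioPerl module, so it is
only an illustration of the counting, not an existing script. It assumes the
sequences are already aligned (position N means the same thing in every
sequence) and that any gap character should be counted like a residue. The
function names read_fasta, position_counts and to_tsv are made up for this
example.

```python
from collections import Counter
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted file handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            # Sequence data may be wrapped over several lines.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def position_counts(sequences):
    """Return one Counter per position, holding only residues actually seen.

    Assumption: sequences are pre-aligned, so index i is comparable
    across all of them. Shorter sequences simply contribute nothing
    to positions past their end.
    """
    counts = []
    for seq in sequences:
        for i, residue in enumerate(seq.upper()):
            if i == len(counts):
                counts.append(Counter())
            counts[i][residue] += 1
    return counts


def to_tsv(counts):
    """Format counts as tab-delimited rows: position, residue, count."""
    rows = ["position\tresidue\tcount"]
    for pos, counter in enumerate(counts, start=1):  # 1-based positions
        for residue in sorted(counter):
            rows.append(f"{pos}\t{residue}\t{counter[residue]}")
    return "\n".join(rows)


if __name__ == "__main__":
    # Tiny made-up input, only to show the shape of the output.
    demo = io.StringIO(">seq1\nMKV\n>seq2\nMRV\n>seq3\nMKL\n")
    seqs = [seq for _, seq in read_fasta(demo)]
    print(to_tsv(position_counts(seqs)))
```

Because a Counter only stores keys that were incremented, residues present 0
times never appear in the output. The long format (one row per position and
residue) was chosen here because it imports cleanly into Excel; a wide table
with one column per residue would work as well.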

It is a fairly simple script to write but I try to apply the
do-not-reinvent-the-wheel dogma.

Is there a bioperl module or an existing script that would fit the bill?

Thanks,
Stephan.



