[Bioperl-l] Count amino acid frequency

Brian Osborne brian_osborne at cognia.com
Fri Jun 27 15:10:02 EDT 2003


Stephan,

OK, simple. If the sequences weren't of equal length and it was essential to
account for each sequence, then I'd say you'd have to make an alignment using
your file as input (Bio::Tools::Run::Alignment::Clustalw), and then you
could slice the alignment into columns with Bio::SimpleAlign::slice and
analyze each column with Bio::Tools::SeqStats. In fact, it may still be easier
to do this than to take each sequence, split it into an array, and so on.
There may be other approaches of course, and I'm not sure about the details;
this is just what my first try would be. But you should probably just wait
one minute, Jason will probably write this application for you...

;-)
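
For the equal-length case, the "split each sequence into an array" route
mentioned above can be sketched in plain Perl, without any BioPerl modules.
This is only a minimal sketch: the sub names (read_fasta, column_counts,
format_table) and the three-sequence input are made up for illustration, and
it assumes the sequences need no alignment.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Parse FASTA text from a filehandle into a list of sequence strings.
sub read_fasta {
    my ($fh) = @_;
    my @seqs;
    while (my $line = <$fh>) {
        $line =~ s/\s+\z//;            # strip trailing newline/whitespace
        next if $line eq '';
        if ($line =~ /^>/) {
            push @seqs, '';            # a header line starts a new record
        }
        elsif (@seqs) {
            $seqs[-1] .= uc $line;     # a sequence may span several lines
        }
    }
    return @seqs;
}

# For each position, count how often each residue occurs.
# Returns an array ref of hash refs: $counts->[$pos]{$residue} = n.
sub column_counts {
    my @seqs = @_;
    my @counts;
    for my $seq (@seqs) {
        my @residues = split //, $seq;
        $counts[$_]{ $residues[$_] }++ for 0 .. $#residues;
    }
    return \@counts;
}

# One tab-delimited line per position: position, then residue/count pairs.
# Residues never seen at a position have no hash entry, so they are omitted.
sub format_table {
    my ($counts) = @_;
    my $out = '';
    for my $pos (0 .. $#$counts) {
        my $col = $counts->[$pos] || {};
        $out .= join("\t", $pos + 1, map { ($_, $col->{$_}) } sort keys %$col) . "\n";
    }
    return $out;
}

# Made-up three-sequence input, for illustration only.
my $fasta = ">v1\nMKV\n>v2\nMRV\n>v3\nMKL\n";
open my $fh, '<', \$fasta or die "Cannot open in-memory file: $!";
print format_table(column_counts(read_fasta($fh)));
close $fh;
```

Each output line holds a position followed by residue/count pairs, so the
second line for the sample input reads 2, K, 2, R, 1 (tab-separated). To run
it on a real file, open that file in place of the in-memory string.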

Brian O.

-----Original Message-----
From: bioperl-l-bounces at portal.open-bio.org
[mailto:bioperl-l-bounces at portal.open-bio.org]On Behalf Of Stephan Bour
Sent: Friday, June 27, 2003 12:30 PM
To: bioperl-l at portal.open-bio.org
Subject: Re: [Bioperl-l] Count amino acid frequency

Good question. There should be an array length test to eliminate any
sequence that's not full length (81 aa) or doesn't start with a methionine.
Stephan.
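
The test described above could look roughly like the following in plain
Perl. It is a sketch under assumptions: the sub name full_length_variants
and the sample sequences are made up, and residues are assumed to be
upper-case one-letter codes.

```perl
use strict;
use warnings;

my $FULL_LENGTH = 81;   # expected protein length, taken from the message above

# Keep only sequences that are full length and begin with methionine (M).
sub full_length_variants {
    my @seqs = @_;
    return grep { length($_) == $FULL_LENGTH && /^M/ } @seqs;
}

# Made-up sequences for illustration.
my @seqs = (
    'M' . 'A' x 80,   # kept: 81 aa, starts with M
    'M' . 'A' x 70,   # dropped: too short
    'L' . 'A' x 80,   # dropped: no leading methionine
);
my @kept = full_length_variants(@seqs);
printf "%d of %d sequences kept\n", scalar @kept, scalar @seqs;
```

Filtering before counting keeps every remaining sequence the same length, so
position N means the same thing in each of them.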

> Stephan,
>
> Are all of your protein variants guaranteed to be of the same length?
>
> Brian O.
>
> -----Original Message-----
> From: bioperl-l-bounces at portal.open-bio.org
> [mailto:bioperl-l-bounces at portal.open-bio.org]On Behalf Of Stephan Bour
> Sent: Friday, June 27, 2003 11:35 AM
> To: bioperl-l at portal.open-bio.org
> Subject: [Bioperl-l] Count amino acid frequency
>
> I'm new to the list and bioperl so I hope this is not too stupid a
> question.
>
> I need to write a perl script that does the following:
> - Take a file with about 1000 sequences of the same protein in FASTA
>   format
> - For each position across all sequences, count the number of occurrences
>   of each possible residue
> - Return only the counts of the residues actually present at each position
>   (in other words, residues present 0 times are not returned).
> - Present the data in tab-delimited format that could be imported into
>   Excel for graphing
>
> It is a fairly simple script to write but I try to apply the
> do-not-reinvent-the-wheel dogma.
>
> Is there a bioperl module or an existing script that would fit the bill?
>
> Thanks,
> Stephan.
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>

