[EMBOSS] about pepstats

Peter Rice pmr at ebi.ac.uk
Mon Nov 10 09:34:53 UTC 2003


Qiang Tu wrote:
> I want to calculate the molecular weight and isoelectric point of all sequences in genbank.

Those are protein properties, so I assume you mean some other database?

> Scripts are too slow and I want to use the programs in EMBOSS directly.

> There are two questions:
> 
> 1. iep supports multiple sequences in one file, but pepstats only outputs the result for the first sequence.
>  Is that true?


Yes - some programs in EMBOSS can work over all sequences in a database,
or all sequences in a file or in a list of sequences. It depends on how
useful such output is (for example, whether any EMBOSS user has asked
for such an extension).

It also depends on how easily a GUI or Web interface can cope with
multiple outputs - they need to automatically match the outputs to each
input sequence.

> 2. How can I make iep and pepstats output a simple result? I just want the mw and pI. :-)

That is something we are working on for a future release. Best to wait
for a general solution.

Meanwhile, you can use Perl or another scripting language to extract the
numbers you need from the output, and to loop over all sequences.
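As a rough illustration of that kind of extraction script, here is a minimal
Python sketch. It assumes each pepstats report starts with a header line of
the form "PEPSTATS of <name> ..." and contains lines of the form
"Molecular weight = <number>" and "Isoelectric Point = <number>"; check this
against the output of your own EMBOSS version, as the layout may differ. The
function name extract_mw_pi is made up for this example and is not part of
EMBOSS.

```python
import re

# Assumed report layout (verify against your EMBOSS version):
#   PEPSTATS of <name> from 1 to N
#   Molecular weight = 38590.05    Residues = 360
#   Isoelectric Point = 6.8820
HEADER_RE = re.compile(r"^PEPSTATS of (\S+)")
MW_RE = re.compile(r"Molecular weight\s*=\s*([0-9]+(?:\.[0-9]+)?)")
PI_RE = re.compile(r"Isoelectric Point\s*=\s*([0-9]+(?:\.[0-9]+)?)")


def extract_mw_pi(report_text):
    """Return one dict per report with keys 'name', 'mw' and 'pi'.

    Several reports concatenated into one string are handled; a value
    that is not found in a report is left as None.
    """
    records = []
    current = None
    for line in report_text.splitlines():
        header = HEADER_RE.match(line)
        if header:
            # A new report begins: start a fresh record.
            current = {"name": header.group(1), "mw": None, "pi": None}
            records.append(current)
            continue
        if current is None:
            # Skip anything that appears before the first header.
            continue
        mw = MW_RE.search(line)
        if mw:
            current["mw"] = float(mw.group(1))
        pi = PI_RE.search(line)
        if pi:
            current["pi"] = float(pi.group(1))
    return records
```

To run over all sequences, one option is to write each sequence to its own
file, run pepstats on each, concatenate the reports, and pass the combined
text to extract_mw_pi.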

But if you are a programmer, you could change the programs (we can
help) to produce the output you want or to read sets of sequences.

regards,

Peter Rice



