[Bioperl-l] ORF identification/prediction
Fernan Aguero
fernan@iib.unsam.edu.ar
Mon, 8 Jan 2001 19:58:17 -0300
Currently I am calling getorf (from the EMBOSS package) in my scripts to
do this for me.
[fernan@iib4 fernan]$ getorf -h
   Mandatory qualifiers:
  [-sequence]      seqall     Sequence database USA
  [-outseq]        seqoutall  Output sequence(s) USA

   Optional qualifiers:
   -table          list       Code to use
   -minsize        integer    Minimum nucleotide size of ORF to report
   -find           list       This is a small menu of possible output
                              options. The first four options are to
                              select either the protein translation or
                              the original nucleic acid sequence of the
                              open reading frame. There are two possible
                              definitions of an open reading frame: it
                              can either be a region that is free of
                              STOP codons or a region that begins with a
                              START codon and ends with a STOP codon.
                              The last three options are probably only
                              of interest to people who wish to
                              investigate the statistical properties of
                              the regions around potential START or STOP
                              codons. The last option assumes that ORF
                              lengths are calculated between two STOP
                              codons.

   Advanced qualifiers:
   -[no]methionine bool       START codons at the beginning of protein
                              products will usually code for Methionine,
                              despite what the codon will code for when
                              it is internal to a protein. This
                              qualifier sets all such START codons to
                              code for Methionine by default.
   -circular       bool       Is the sequence circular
   -[no]reverse    bool       Set this to be false if you do not wish to
                              find ORFs in the reverse complement of the
                              sequence.
   -flanking       integer    If you have chosen one of the options of
                              the type of sequence to find that gives
                              the flanking sequence around a STOP or
                              START codon, this allows you to set the
                              number of nucleotides either side of that
                              codon to output. If the region of flanking
                              nucleotides crosses the start or end of
                              the sequence, no output is given for this
                              codon.
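
The two ORF definitions mentioned under -find are worth keeping apart in
any new module, since they give different results on the same sequence.
A minimal sketch of the difference, in Python rather than Perl, and not
getorf's actual algorithm: it assumes the standard genetic code, ATG as
the only START codon, and the forward strand only. The function name
find_orfs and its parameters are made up for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumption: standard genetic code
START_CODON = "ATG"                  # assumption: ATG is the only START


def find_orfs(seq, require_start=False, min_size=0):
    """Return (frame, start, end) tuples for forward-strand ORFs.

    Coordinates are 0-based and end-exclusive; the STOP codon is not
    included. With require_start=False an ORF is any STOP-free region
    (including regions running into either end of the sequence). With
    require_start=True an ORF runs from the first in-frame START codon
    to the next STOP codon, and a closing STOP is required.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        # Position just past the last complete codon in this frame.
        last_codon_end = frame + (len(seq) - frame) // 3 * 3
        region_start = frame
        for pos in range(frame, last_codon_end + 1, 3):
            at_end = pos == last_codon_end
            if not at_end and seq[pos:pos + 3] not in STOP_CODONS:
                continue
            # A STOP codon (or the end of the sequence) closes the region.
            start = region_start
            region_start = pos + 3
            if require_start:
                if at_end:
                    continue  # no closing STOP codon
                starts = [p for p in range(start, pos, 3)
                          if seq[p:p + 3] == START_CODON]
                if not starts:
                    continue  # no START codon in this region
                start = starts[0]
            if pos > start and pos - start >= min_size:
                orfs.append((frame, start, pos))
    return orfs


if __name__ == "__main__":
    dna = "CCATGAAATAGCC"
    print(find_orfs(dna))                      # STOP-free regions
    print(find_orfs(dna, require_start=True))  # START..STOP regions
```

Handling the reverse complement, circular sequences and alternative
codon tables (the -reverse, -circular and -table qualifiers above) is
deliberately left out of this sketch.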
What I find annoying about EMBOSS apps is that the -h (-help) option
prints limited information: unless an option is 'boolean' or 'integer',
you don't know what values it accepts. You have to go to the EMBOSS
web site to look for extended help!
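
For what it is worth, a minimal sketch of calling getorf from a script,
written in Python for illustration. It only uses qualifiers that appear
in the help output above. The numeric values for -find are an
assumption recalled from the extended EMBOSS documentation (0 to 3
selecting the ORF definition and output type, 4 to 6 the flanking
regions), so check them against the web site before relying on them.
The helper names and the file names are hypothetical.

```python
import subprocess


def getorf_command(infile, outfile, find=1, minsize=300, reverse=True):
    """Build the argument list for a getorf run.

    find: assumed menu value for -find (1 is assumed to mean protein
    translations of START..STOP regions); verify against the EMBOSS docs.
    minsize: minimum nucleotide size of ORF to report (-minsize).
    reverse: pass -noreverse when False, to skip the reverse complement.
    """
    cmd = [
        "getorf",
        "-sequence", infile,
        "-outseq", outfile,
        "-find", str(find),
        "-minsize", str(minsize),
    ]
    if not reverse:
        cmd.append("-noreverse")
    return cmd


def run_getorf(infile, outfile, **options):
    """Run getorf and raise CalledProcessError if it exits non-zero."""
    subprocess.run(getorf_command(infile, outfile, **options), check=True)


if __name__ == "__main__":
    # Hypothetical file names; print the command instead of running it.
    print(" ".join(getorf_command("contigs.fasta", "orfs.fasta")))
```

Building the argument list separately from running it makes it easy to
log or inspect the exact command before getorf is invoked.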
Hope this helps,
Fernan
On Mon, 08 Jan 2001 18:10:26 Jason Stajich wrote:
> To the best of my knowledge, we don't currently have bioperl modules
> that predict/identify (depending on your confidence in the software =)
> Open Reading Frames. Eric and I were thinking of working on a bioperl
> module for this. Any suggestions, known pitfalls, etc are welcomed.
>
>
> Jason Stajich
> jason@chg.mc.duke.edu
> Center for Human Genetics
> Duke University Medical Center
> http://www.chg.duke.edu/
--
# --------------------------------------------------------- #
# _ #
# Fernan Aguero | / \ #
# Bioinformatics | ASCII \ / against #
# IIB-UNSAM | ribbon / HTML #
# fernan@iib.unsam.edu.ar | campaign / \ email #
# ICQ 100325972 | / \ #
# #
# --------------------------------------------------------- #