[Bioperl-l] ortholog identification for many species

Brian Osborne brian_osborne@cognia.com
Fri, 5 Jul 2002 08:31:13 -0400


Martin,

Some would argue, and I've heard it argued, that orthologs are those pairs
of genes that have arisen from a single gene in the ancestral genome. What
this suggests is that one finds orthologs by aligning syntenic regions from
the genomes in question and then attempting to find the orthologs by their
map position, as well as their similarity. Of course, this approach will
fail with genomes sufficiently distant in evolutionary terms that any
synteny is undetectable (though perhaps transitivity will help: if A and B
are orthologues and B and C are orthologues, then so are A and C). This
approach would be aided by alignment software designed to handle
genome-scale nucleotide alignments, and I've read about alignment programs
like this (e.g. blastz), though I don't think any are integrated into
Bioperl (someone will correct me if I'm wrong).
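
To make the transitivity idea concrete, here is a minimal Python sketch (not Bioperl, and not an existing tool). It assumes you already have pairwise ortholog calls from some upstream step such as synteny plus similarity; the function name ortholog_groups and the "species:gene" labels are hypothetical. It simply merges any genes linked by a chain of pairwise calls, so it treats orthology as strictly transitive, which is a heuristic that duplications can break.

```python
from collections import defaultdict


def ortholog_groups(pairs):
    """Group genes linked by a chain of pairwise ortholog calls (union-find)."""
    parent = {}

    def find(gene):
        # Register unseen genes as their own root, then walk up to the root.
        parent.setdefault(gene, gene)
        while parent[gene] != gene:
            parent[gene] = parent[parent[gene]]  # path halving
            gene = parent[gene]
        return gene

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b  # merge the two groups

    groups = defaultdict(set)
    for gene in parent:
        groups[find(gene)].add(gene)
    return sorted(sorted(members) for members in groups.values())


if __name__ == "__main__":
    # Hypothetical pairwise calls; A~B and B~C imply the group {A, B, C}.
    calls = [
        ("human:G1", "mouse:G1"),
        ("mouse:G1", "fugu:G1"),
        ("human:G2", "mouse:G2"),
    ]
    for group in ortholog_groups(calls):
        print(group)
```

A group that ends up with two genes from the same species is a sign that the chain has crossed a duplication and needs a closer look.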

What's encouraging is that this strict approach will become easier as
genomes are finished and more and more genes are placed precisely on the
finished sequences. However, there are formal problems that will arise as
one attempts to address more and more genomes: gene duplication, gene
deletion or mutational inactivation, rearrangement, etc. One could probably
write a book on this topic.

Brian O.

PS When are genes orthologues? To answer this question we first need to
agree on the definition of orthologues. From an evolutionary perspective,
orthologues are a pair of genes, one in each species, that are descended
from a single gene in the last common ancestor of those two species. Thus,
orthologue assignments are purely historical and do not involve function.
Functions of genes may change during evolution as lineal descendants acquire
new functions or lose old functions.

From http://zfin.org/zf_info/monitor/vol7.1/vol7.1.html#TO ORTHOLOGUE OR NOT
TO ORTHOLOGUE, THAT IS THE QUESTION

-----Original Message-----
From: bioperl-l-admin@bioperl.org [mailto:bioperl-l-admin@bioperl.org] On
Behalf Of Martin Lercher
Sent: Friday, July 05, 2002 6:29 AM
To: bioperl-l@bioperl.org
Subject: [Bioperl-l] ortholog identification for many species

Hi,

I have to identify a large number of orthologs for a dozen or so species.
There was a discussion on the identification of orthologs here some months
ago (and I got some information from Elia on the approach they took for the
Fugu project); however, my impression is that the methods discussed there
were more useful for a 2-species problem.

My feeling is that to identify orthologs in many species, one has to draw a
phylogenetic tree for all homologous sequences, and look for clusters of
genes that contain exactly one gene for each species (or, if more than one
gene for one species, these should be clustered together). Has anything
similar to this approach been implemented in Bioperl (or at all)? Or would
you suggest a different approach?
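
The tree-based idea above can be sketched roughly as follows, in Python rather than Bioperl. Everything here is an assumption for illustration: the gene tree is a nested tuple, leaves are hypothetical "species|gene" labels, and single_copy_clades is an invented name. The sketch returns the largest subtrees in which no species occurs twice; it does not require every species to be present, and it does not handle the case of several same-species genes clustered together.

```python
def leaves(tree):
    """Return all leaf labels below a node (a leaf is a plain string)."""
    if isinstance(tree, str):
        return [tree]
    found = []
    for child in tree:
        found.extend(leaves(child))
    return found


def species_of(leaf):
    """Extract the species from an assumed 'species|gene' leaf label."""
    return leaf.split("|", 1)[0]


def single_copy_clades(tree):
    """Return maximal clades in which every species appears at most once."""
    members = leaves(tree)
    species = [species_of(member) for member in members]
    if len(set(species)) == len(species):
        # No species is repeated, so this whole clade is one candidate group.
        return [sorted(members)]
    # A species is repeated below this node: descend into each child.
    clades = []
    for child in tree:
        clades.extend(single_copy_clades(child))
    return clades


if __name__ == "__main__":
    # Hypothetical tree: an ancient duplication gave families A1 and A2.
    gene_tree = (
        (("human|A1", "mouse|A1"), "fugu|A1"),
        (("human|A2", "mouse|A2"), "fugu|A2"),
    )
    for clade in single_copy_clades(gene_tree):
        print(clade)
```

Filtering the returned clades for those that contain all species of interest would give the stricter "exactly one gene per species" clusters.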

Cheers,
Martin
________________________

Martin Lercher, Ph.D.
Department of Biology and Biochemistry
University of Bath
Claverton Down
Bath, Somerset
BA2 7AY, UK
Tel. +49-178-2573652
Fax +44-1225-386779
email: m.j.lercher@bath.ac.uk