[Bioperl-l] How to get from gi/ref/gb to genomic coordinates ?
Rainer Machne
raim at tbi.univie.ac.at
Wed Jan 31 21:09:49 UTC 2007
Dear Bioperl list,
I hope this is the right list for a short question:
Is there a standard way, or are there suitable (Bioperl) tools, to get from
a GI number (gi) or one of the other IDs (see below) to the genomic
coordinates of the respective gene?
We have FASTA files retrieved from NCBI protein BLAST searches against
fungal genomes, with headers such as:
>gi|46100068|gb|EAK85301.1| hypothetical protein UM04252.1 [Ustilago maydis 521]
or
>gi|50292953|ref|XP_448909.1| unnamed protein product [Candida glabrata]
(my set contains only gi, ref and gb identifiers).
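To make the header layout concrete, here is a minimal Python sketch that splits such a header line into its identifiers and the free-text description. The function name parse_defline and the KNOWN_TAGS set are hypothetical names chosen for illustration; the sketch assumes only the gi, gb and ref tags shown above and does not look anything up at NCBI.

```python
# Database tags seen in the headers above (assumption: no other tags occur).
KNOWN_TAGS = {"gi", "gb", "ref"}

def parse_defline(header):
    """Split an NCBI-style FASTA header into ({tag: identifier}, description)."""
    fields = header.lstrip(">").split("|")
    ids = {}
    i = 0
    # Tags and identifiers alternate: gi|46100068|gb|EAK85301.1| description
    while i + 1 < len(fields) and fields[i] in KNOWN_TAGS:
        ids[fields[i]] = fields[i + 1]
        i += 2
    # Whatever remains after the last tag/identifier pair is the description.
    description = "|".join(fields[i:]).strip()
    return ids, description

ids, description = parse_defline(
    ">gi|50292953|ref|XP_448909.1| unnamed protein product [Candida glabrata]"
)
```

Here ids maps "gi" to "50292953" and "ref" to "XP_448909.1", which are the accessions one would then use to look up the corresponding nucleotide record.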
I retrieved all my fasta files from whole fungal genomes with available
protein sequences at
http://www.ncbi.nlm.nih.gov/sutils/genom_table.cgi?organism=fungi
As I only searched whole finished genomes (not shotgun), I thought it
would then be easy to get the genomic coordinates and retrieve upstream
sequences, but we have failed so far to find a consistent way to do this
automatically. Many of the gi entries refer to mRNAs or partial mRNAs
and the way to the coordinates seems to differ for each case.
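For the last step, once coordinates are known, the upstream region itself is simple arithmetic. The following Python sketch assumes the gene's start, end and strand have already been obtained somehow (for example from the CDS feature of the matching nucleotide record) and are 1-based and inclusive, as in GenBank feature tables. The function names and the strand convention (+1 or -1) are illustrative assumptions, and the sketch does not address the actual problem of mapping an ID to coordinates.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def upstream_interval(start, end, strand, length, seq_len):
    """Return the (lo, hi) 1-based inclusive interval upstream of a gene.

    Returns None when the gene sits at the edge of the sequence and
    there is no upstream region left.
    """
    if strand >= 0:
        # Plus strand: upstream lies before the gene start.
        lo, hi = max(1, start - length), start - 1
    else:
        # Minus strand: upstream lies after the gene end on the plus strand.
        lo, hi = end + 1, min(seq_len, end + length)
    return (lo, hi) if lo <= hi else None

def upstream_sequence(genome, start, end, strand, length):
    """Extract the upstream sequence, oriented 5'->3' relative to the gene."""
    interval = upstream_interval(start, end, strand, length, len(genome))
    if interval is None:
        return ""
    lo, hi = interval
    seq = genome[lo - 1:hi]  # convert 1-based inclusive to a Python slice
    if strand < 0:
        seq = seq.translate(_COMPLEMENT)[::-1]  # reverse complement
    return seq
```

The interval is clipped at the ends of the chromosome or contig, so genes near an edge yield a shorter (or empty) upstream sequence rather than an error.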
Any suggestions would be appreciated.
with kind regards,
Rainer Machne
University of Vienna
Department for Theoretical Chemistry
Theoretical Biochemistry Group