[Bioperl-l] genome position mapping of RefSeq IDs

Robert Citek robert.citek at gmail.com
Wed Feb 25 20:40:00 UTC 2009


Hello all,

I have a list of RefSeq IDs for which I can parse out all the
annotation (e.g. exons, SNPs, etc.).  For this one project, I need the
same coordinate information relative to the genome rather than the
transcript.  Is such mapping information available?  Or are pieces
available so that I can string them together?

A simple use case would be for me to query a dataset with a RefSeq ID
and have it return the genomic coordinates of all of that transcript's
introns.
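
To make the use case concrete, here is a rough sketch in Python of the
last step only: deriving introns once the exon coordinates are already
known in genomic terms. It assumes 1-based inclusive (start, end) pairs
on a single chromosome or contig; the function name and the example
numbers are made up for illustration.

```python
def introns_from_exons(exons):
    """Return intron intervals as the gaps between consecutive exons.

    exons: iterable of (start, end) pairs in genomic coordinates,
    1-based and inclusive, all on the same sequence (an assumption
    of this sketch, not something the mapping files guarantee).
    """
    ordered = sorted(exons)
    introns = []
    # Walk adjacent exon pairs; any gap between them is an intron.
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        if next_start > prev_end + 1:
            introns.append((prev_end + 1, next_start - 1))
    return introns


if __name__ == "__main__":
    # Hypothetical three-exon transcript.
    print(introns_from_exons([(100, 200), (301, 400), (500, 650)]))
```

Sorting by start position means the same code works for transcripts on
either strand, as long as the intervals are given with start <= end.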

I've looked at the mapping information at
ftp://ftp.ncbi.nih.gov/gene/DATA, which gets me close but seems to be
missing some parts.  Or is that what I'm looking for and I just don't
see how the pieces fit?

Thanks in advance for any pointers in the right direction.

Regards,
- Robert

On Wed, Oct 22, 2008 at 2:54 PM, Chris Fields <cjfields at illinois.edu> wrote:
> You can 'epost' in increments if you have more IDs, up to 1000-2000 I think.
>  Beyond that, you should probably use one of the mapping files located in
> the ftp.ncbi.nih.gov/gene/DATA folder and just use it locally (initially
> index the data with DB_File, search using a tied hash, etc).
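
The local-index approach quoted above (DB_File plus a tied hash in
Perl) might look roughly like this in Python, with the standard dbm
module standing in for DB_File. This is only a sketch: the function
names are hypothetical, and the column positions are parameters because
they depend on which mapping file is used and should be checked against
that file's header line.

```python
import dbm


def build_index(tsv_path, db_path, key_col, value_cols):
    """Index a tab-delimited mapping file into an on-disk key/value store.

    key_col and value_cols are 0-based column positions chosen by the
    caller; they are assumptions to verify against the real file.
    """
    with open(tsv_path) as handle, dbm.open(db_path, "n") as db:
        for line in handle:
            if not line.strip() or line.startswith("#"):
                continue  # skip blank lines and header/comment lines
            row = line.rstrip("\n").split("\t")
            key = row[key_col].encode()
            value = "\t".join(row[i] for i in value_cols).encode()
            # One ID may appear on several rows, so append rather than overwrite.
            db[key] = (db[key] + b"\n" + value) if key in db else value


def lookup(db_path, accession):
    """Return all stored rows for an accession as lists of strings."""
    with dbm.open(db_path, "r") as db:
        key = accession.encode()
        if key not in db:
            return []
        return [rec.split("\t") for rec in db[key].decode().split("\n")]
```

After a one-time call to build_index, each lookup reads only the
matching record from disk instead of rescanning the whole file.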


