[Bioperl-l] a stupid question

Sean Davis sdavis2 at mail.nih.gov
Tue Sep 27 06:36:47 EDT 2005


On 9/27/05 3:20 AM, "xuying" <xuying at sibs.ac.cn> wrote:

> Dear All:
>       I have a file containing the GeneIDs of the genes I want. How can I
> map each GeneID to its FASTA sequence (the genomic sequence of the gene)?
> There is position and contig information in the gene2accession file, but is
> there a simpler way? Thanks!

Do you truly want the genomic sequence for a given gene, or do you want the
spliced transcript?  The spliced transcripts are called RefSeqs in NCBI
parlance, and there may be zero to many RefSeqs for each Gene ID.  There are
a number of ways to get the RefSeq sequences.  You can use the NCBI FTP site
to get the mRNA for them, or you can use Ensembl or UCSC to get the genomic
versions of them (the spliced genomic sequence).  If you truly want the
"locus" associated with a given Gene ID, I don't know of an easily
retrievable, standardized place where that information is stored, but Stefan
may be able to tell you how to get it from the ASN.1 files from NCBI.
(Stefan?)
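
As a starting point for either route, here is a rough Python sketch of
pulling the RefSeq mRNA accessions and the genomic placement for a list of
Gene IDs out of the gene2accession file mentioned in the question.  The
column positions are an assumption about that file's tab-delimited layout and
should be checked against the README that ships with it (including whether
the coordinates are 0- or 1-based).  The function name map_gene_ids and the
sample rows (Gene IDs 1001 and 1002, the NM_TEST/NC_TEST accessions) are
invented for illustration and are not real records.

```python
import io
from collections import defaultdict

# Assumed 0-based column positions in the tab-delimited gene2accession file.
# Verify these against the README distributed alongside the file.
COL_GENE_ID = 1
COL_RNA_ACC = 3
COL_GENOMIC_ACC = 7
COL_START = 9
COL_END = 10
COL_ORIENTATION = 11


def map_gene_ids(handle, wanted_ids):
    """Collect RefSeq mRNA accessions and genomic placements per Gene ID.

    Returns {gene_id: {"rna": set of accessions,
                       "genomic": set of (accession, start, end, strand)}}.
    Gene IDs that never appear in the file are absent from the result.
    """
    wanted = set(wanted_ids)
    result = defaultdict(lambda: {"rna": set(), "genomic": set()})
    for line in handle:
        if line.startswith("#"):
            continue  # header line
        row = line.rstrip("\n").split("\t")
        if len(row) <= COL_ORIENTATION or row[COL_GENE_ID] not in wanted:
            continue
        entry = result[row[COL_GENE_ID]]  # created even if it stays empty
        rna = row[COL_RNA_ACC]
        if rna != "-":  # "-" marks a missing value
            entry["rna"].add(rna)
        genomic, start, end = row[COL_GENOMIC_ACC], row[COL_START], row[COL_END]
        if "-" not in (genomic, start, end):
            entry["genomic"].add(
                (genomic, int(start), int(end), row[COL_ORIENTATION])
            )
    return dict(result)


# Invented sample rows, only to exercise the function.
_ROWS = [
    ["#tax_id", "GeneID", "status", "RNA_acc", "RNA_gi", "prot_acc",
     "prot_gi", "genomic_acc", "genomic_gi", "start", "end", "orientation"],
    ["9606", "1001", "REVIEWED", "NM_TEST1.1", "-", "NP_TEST1.1", "-",
     "NC_TEST.1", "-", "99", "499", "+"],
    ["9606", "1001", "REVIEWED", "NM_TEST2.1", "-", "NP_TEST2.1", "-",
     "NC_TEST.1", "-", "99", "499", "+"],
    ["9606", "1002", "VALIDATED", "-", "-", "-", "-",
     "NC_TEST.1", "-", "-", "-", "?"],
]
SAMPLE = "\n".join("\t".join(r) for r in _ROWS) + "\n"

hits = map_gene_ids(io.StringIO(SAMPLE), ["1001", "1002", "9999"])
for gene_id, info in sorted(hits.items()):
    print(gene_id, sorted(info["rna"]), sorted(info["genomic"]))
```

With a real download you would pass an open file handle instead of the
StringIO sample.  The accessions collected under "rna" can then be fetched
from NCBI, and the tuples under "genomic" give the contig and coordinates
needed to cut the gene's region out of the genomic sequence.  Note that a
Gene ID can legitimately come back with no mRNA accessions at all.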

Sean


