[Bioperl-l] How to extract promoter region seq from genbank or another source?

Sean Davis sdavis2 at mail.nih.gov
Fri Jan 6 06:54:00 EST 2006


One way to do this is to map your GenBank accessions to UniGene and then map
those to Entrez Gene.  The simplest way to do this is to use Stanford Source
(http://smd.stanford.edu/cgi-bin/source/sourceBatchSearch).  Just paste in
your list of accessions, choose to map from Genbank Accession, and then
choose what you want to map to (LocusLink ID, in this case).  From there,
you can use the LocusLink IDs (now called Entrez Gene IDs) to search in
Ensembl or TRASER for upstream sequences.
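
Once you have a gene's coordinates and strand from one of those sources,
the upstream window itself is simple arithmetic. The one catch is that
"upstream" of a minus-strand gene lies at higher coordinates. Here is a
minimal Python sketch, assuming 1-based inclusive coordinates. The name
upstream_window and its parameters are made up for illustration and are not
part of BioPerl, Ensembl or TRASER.

```python
from typing import Optional, Tuple


def upstream_window(start: int, end: int, strand: str,
                    length: int = 1000,
                    chrom_length: Optional[int] = None
                    ) -> Optional[Tuple[int, int]]:
    """Return the 1-based inclusive (start, end) of the region upstream of a gene.

    start/end are the gene's coordinates on the forward strand (start <= end).
    Returns None when no upstream base exists (gene at the chromosome edge).
    """
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    if strand == "+":
        # Upstream of a plus-strand gene is just before its start.
        lo, hi = start - length, start - 1
    elif strand == "-":
        # Upstream of a minus-strand gene is just after its end.
        lo, hi = end + 1, end + length
    else:
        raise ValueError("strand must be '+' or '-'")
    # Clip the window to the chromosome boundaries.
    lo = max(lo, 1)
    if chrom_length is not None:
        hi = min(hi, chrom_length)
    if lo > hi:
        return None
    return lo, hi


print(upstream_window(5000, 8000, "+"))  # (4000, 4999)
print(upstream_window(5000, 8000, "-"))  # (8001, 9000)
```

For a minus-strand gene, the bases in that window still need to be
reverse-complemented so that they read 5' to 3' toward the gene.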

The other alternative is to use the UCSC genome table browser directly
(http://genome.ucsc.edu/cgi-bin/hgTables).  Choose your organism of interest,
then group "mRNA and EST tracks", table "all_mrna", and region "genome".
Click "Paste List", paste in your list of accessions, and submit it.  Then
choose output format "sequence" with "plain text" output and click "get
output".  The next page shows the options for the sequence output; choose
whatever you like.
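
If you save the plain-text output to a file, it can be read back with a few
lines of code. Here is a minimal Python sketch, assuming the output is
ordinary FASTA (a ">" header line followed by sequence lines). It makes no
assumption about the header layout beyond using the first
whitespace-delimited token as the record name. The name read_fasta and the
sample records are hypothetical.

```python
from typing import Dict, Iterable, List


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Collect FASTA records into a dict of record name -> sequence."""
    records: Dict[str, List[str]] = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines between records
        if line.startswith(">"):
            # Use the first token of the header as the record name.
            fields = line[1:].split()
            current = fields[0] if fields else ""
            records[current] = []
        elif current is not None:
            # Sequence lines are appended to the most recent record.
            records[current].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


# Made-up sample in place of a real downloaded file.
sample = """>seqA range=chr1:100-110
ACGT
acgt

>seqB
TTTT
""".splitlines()

print(read_fasta(sample))  # {'seqA': 'ACGTacgt', 'seqB': 'TTTT'}
```

With a real file you would pass the open file handle instead of the sample
lines. Note that a repeated record name replaces the earlier record.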

Hope this helps and keeps things simple.


On 1/6/06 3:05 AM, "Ewan Birney" <birney at ebi.ac.uk> wrote:

> 
> 
> hz5 at njit.edu wrote:
>> http://siriusb.umdnj.edu:18080/EZRetrieve/index.jsp
>> 
>> Quoting Stefan Kirov <skirov at utk.edu>:
>> 
>>> Sam,
>>> You can use MART to convert to ensembl id (in most cases). I don't think
>>> they support GenBank. You can try to use genekeydb
>>> (genereg.ornl.gov/gkdb), either download it or use the online converter,
> 
> 
> I know this is rather old, but at Ensembl we do, of course, track GenBank
> accession numbers - these are the identifiers shared with EMBL. We don't
> track GenBank gi numbers as they are too volatile.
> 
> 
>>> but my guess is you are not going to get too many ids. One thing I may
>>> fix in the future, but right now... Still may be worth a try. Look at
>>> seqhound too (http://www.blueprint.org/seqhound/index.html).
>>> Stefan
>>> 
>>> Brian Osborne wrote:
>>> 
>>>> ENSEMBL experts?
>>>> 
>>>> ------ Forwarded Message
>>>> From: Sam Al-Droubi <saldroubi at yahoo.com>
>>>> Date: Fri, 14 Oct 2005 14:05:38 -0700 (PDT)
>>>> To: Brian Osborne <brian_osborne at cognia.com>
>>>> Subject: Re: [Bioperl-l] How to extract promoter region seq from genbank or
>>>> another source?
>>>> 
>>>> Hi Brian,
>>>> 
>>>> Thank you for the response.  I looked at it but it seems that Ensembl does
>>>> not use accession numbers.  It seems that they have their own numbering
>>>> scheme.  If so, how do I get the mapping between the two?  If I can't get the
>>>> promoter region sequence, then do you know if there is a way I can get the
>>>> entire chromosome sequence?  If so, I can then try to find the gene within
>>>> it and then grab the promoter region.
>>>> I am new to all this so I am sorry if I sound ignorant in this area.
>>>> 
>>>> On the surface, it seems that one should be able to do this easily but it
>>>> has not been easy so far.
>>>> 
>>>> Thank you. 
>>>> 
>>>> 
>>>> Brian Osborne <brian_osborne at cognia.com> wrote:
>>>>  
>>>> 
>>>>> Sam,
>>>>> 
>>>>> ensembl may be one solution, I think it provides a good API for these sorts
>>>>> of queries. See the ensembl API documentation for more information
>>>>> (http://www.ensembl.org/info/software/core/core_tutorial.html).
>>>>> 
>>>>> Brian O.
>>>>> 
>>>>> 
>>>>> 
>>>>> On 10/13/05 11:25 AM, "Sam Al-Droubi" wrote:
>>>>> 
>>>>>    
>>>>> 
>>>>>>> Hello,
>>>>>>> 
>>>>>>> I am totally new to BioPerl. I was able to install it and retrieve data from
>>>>>>> GenBank. I have a list of accession numbers for genes but I want to use
>>>>>>> BioPerl to get the promoter region (1000 bp before the start of the gene).
>>>>>>> Can someone point me in the right direction on how to accomplish this?
>>>>>>> Tech info: Using bioperl-1.5 on SuSE 9.3 professional machine.
>>>>>>> 
>>>>>>> Thank you.
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> Sincerely, 
>>>>>>> Sam Al-Droubi, M.S.
>>>>>>> saldroubi at yahoo.com
>>>>>>> _______________________________________________
>>>>>>> Bioperl-l mailing list
>>>>>>> Bioperl-l at portal.open-bio.org
>>>>>>> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>>>>>>>        
>>>>>>> 
>>>>>    
>>>>> 
>>>> 
>>>> Sincerely, 
>>>> Sam Al-Droubi, M.S.
>>>> saldroubi at yahoo.com
>>>> 
>>>> ------ End of Forwarded Message
>>>> 
>>>>  
>>>> 
>>> -- 
>>> Stefan Kirov, Ph.D.
>>> University of Tennessee/Oak Ridge National Laboratory
>>> 5700 bldg, PO BOX 2008 MS6164
>>> Oak Ridge TN 37831-6164
>>> USA
>>> tel +865 576 5120
>>> fax +865-576-5332
>>> e-mail: skirov at utk.edu
>>> sao at ornl.gov
>>> 
>>> "And the wars go on with brainwashed pride
>>> For the love of God and our human rights
>>> And all these things are swept aside"
>>> 
>>> 
>> 
>> 
>> 
>> =========================================================
>> Haibo Zhang, PhD
>> Computational Biology
>> http://www.cyberpostdoc.org/
>> Share postdoc information in cyberspace. Welcome your stories, suggestions
>> and 
>> advice!



