[Bioperl-l] How to get from gi/ref/gb to genomic coordinates ?
Jason Stajich
jason at bioperl.org
Thu Feb 1 18:36:02 UTC 2007
On Feb 1, 2007, at 9:55 AM, Chris Fields wrote:
>
> On Feb 1, 2007, at 6:54 AM, Rainer Machne wrote:
>
>> Barry and Jason,
>>
>> thanks for your quick and very helpful replies.
>>
>> I guess we should have done (or repeat) our blast search at
>> http://fungal.genome.duke.edu/
>> to get better mapping from proteins to genomes ?
>>
Well, I'm not quite sure of your exact goals: to find upstream
regions of known genes, or to look at upstream regions of orthologous
genes?
You can first figure out orthologs based on protein similarities,
then go in and extract upstream regions for the orthologous genes (I
provide a link to a big all-vs-all FASTA result at the bottom of the
page if you want those results, as well as some pairwise orthology
assignments, although you may want more or less stringent parameters).
All the GFF and AA data is freely available for download on the site
for each genome we've annotated, or for annotation we've re-formatted,
so you can do things locally and/or modify it to your liking.
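
As a rough illustration of the "extract upstream regions" step, here is a
minimal Python sketch that computes strand-aware upstream coordinates from
GFF gene features. The function names, the 1000 bp default window, and the
use of the raw attributes column as the key are assumptions made for the
example; they are not taken from the site's files or from BioPerl.

```python
def upstream_window(start, end, strand, length=1000, seq_len=None):
    """Return 1-based inclusive (start, end) of the region upstream of a
    feature, or None if no upstream sequence exists.

    'length' (window size) and 'seq_len' (contig length, used only to
    clip minus-strand windows) are hypothetical parameters.
    """
    if strand == "+":
        # Upstream of a plus-strand gene lies before its start coordinate.
        up_start = max(1, start - length)
        up_end = start - 1
    elif strand == "-":
        # Upstream of a minus-strand gene lies after its end coordinate.
        up_start = end + 1
        up_end = end + length
        if seq_len is not None:
            up_end = min(seq_len, up_end)
    else:
        raise ValueError("strand must be '+' or '-'")
    if up_start > up_end:
        return None  # feature sits at the very edge of the sequence
    return up_start, up_end


def upstream_from_gff(lines, length=1000, feature_type="gene"):
    """Map each feature's attribute string to (seqid, start, end, strand)
    of its upstream window, reading tab-delimited GFF lines."""
    results = {}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments/directives
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != feature_type:
            continue
        seqid, strand, attrs = cols[0], cols[6], cols[8]
        window = upstream_window(int(cols[3]), int(cols[4]), strand, length)
        if window is not None:
            results[attrs] = (seqid, window[0], window[1], strand)
    return results
```

For minus-strand genes the extracted sequence would still need to be
reverse-complemented before comparing it with plus-strand upstream regions.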
>> As I retrieved all my proteins via whole genome blasts we should find
>> (most of) them in the genbank files ... a good opportunity for me to
>> learn some Bioperl and the other packages you mentioned in case we want
>> to do more complex analysis later :-)
>>
>> Thank you very much!
>>
>> Rainer
>
> If the data is available in GenBank you could run the BLAST
> searches at NCBI and limit the search with an Entrez query:
>
> http://www.ncbi.nlm.nih.gov/BLAST/blastcgihelp.shtml#entrez_query
>
> Most (all?) genome files are tagged as complete.
>
> I'm not sure but there might be a way of doing this via
> Bio::Tools::Run::RemoteBlast. Jason, any ideas?
>
> chris
--
Jason Stajich
Miller Research Fellow
University of California, Berkeley
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html
http://fungalgenomes.org/