[Bioperl-l] Getting genomic coordinates for a list of genes AND WUBlast

Emanuele Osimo e.osimo at gmail.com
Fri Jul 24 00:48:26 UTC 2009

This is the fix:

use Bio::DB::EntrezGene;
use Bio::EnsEMBL::Slice;
use Bio::EnsEMBL::Registry;

my $db = Bio::DB::EntrezGene->new;

# Connect to the public Ensembl database server
my $registry = 'Bio::EnsEMBL::Registry';
$registry->load_registry_from_db(
   -host => 'ensembldb.ensembl.org',
   -user => 'anonymous'
);
my $slice_adaptor = $registry->get_adaptor( 'Human', 'Core', 'Slice' );

# $chr, $start and $end are the coordinates obtained from genome_coords
my $slice = $slice_adaptor->fetch_by_region( 'chromosome', $chr, $start, $end );
print $slice->seq;

Use this after getting the coordinates with the genome_coords sub.

I have another question for you: I need to use the software WUBlast,
but I noticed that it is no longer available on the website. They just
say that if you already have it, you can keep using it. I don't have
it, but I urgently need it. If anyone has a copy, could you please
send it to me?


On Thu, Jul 23, 2009 at 16:33, Mark A. Jensen <maj at fortinbras.us> wrote:
> Excellent, Emanuele-- would you post your fix to the list?
> thanks--MAJ
> ----- Original Message -----
> From: Emanuele Osimo
> To: Mark A. Jensen
> Cc: perl bioperl ml
> Sent: Thursday, July 23, 2009 7:24 PM
> Subject: Re: [Bioperl-l] Getting genomic coordinates for a list of genes
> Hello everyone.
> Today I discovered that the coupling of the two subs that Mark posted
> doesn't get the right results. I think this is because one gets the
> coordinates with RefSeq build 36.3, the other with build 37.
> I found that coupling the first sub, genome_coords, with the
> Bio::EnsEMBL::Registry fetch by region API is a lot better, and it actually
> generates sequences that contain the genes.
> Bye
> Emanuele
> P.S.
> Thanks a lot to Mark!!
> On Thu, Jul 23, 2009 at 16:16, Mark A. Jensen <maj at fortinbras.us> wrote:
>> Sorry, went off-list for a couple cycles. The final product will get the
>> correct chromosomal coordinates and then return the sequence from
>> the current build, based on a geneID input. See
>> http://www.bioperl.org/wiki/Human_genomic_coordinates_and_sequence
>> for the results.
>> cheers MAJ
>> ----- Original Message -----
>> From: "Emanuele Osimo" <e.osimo at gmail.com>
>> To: "perl bioperl ml" <bioperl-l at lists.open-bio.org>
>> Sent: Friday, July 17, 2009 8:49 AM
>> Subject: [Bioperl-l] Getting genomic coordinates for a list of genes
>>> Hello everyone,
>>> I'm new to programming, I'm a biologist, so please forgive my ignorance,
>>> but I've been trying this for 2 weeks, now I have to ask you.
>>> I'm trying the script I found at
>>> http://bio.perl.org/wiki/HOWTO:Getting_Genomic_Sequences#Using_Bio::DB::EntrezGene_to_get_genomic_coordinates
>>> because I need to have some variables (like $from and $to) assigned to
>>> the start and end of a gene.
>>> The script works fine, but gives me the wrong coordinates: for example if
>>> I try it with the gene 842 (CASP9), it prints:
>>> NT_004610.19    2498878    2530877
>>> I found out that in Entrez, for each gene (for CASP9, for example, at
>>> http://www.ncbi.nlm.nih.gov/gene/842?ordinalpos=1&itool=EntrezSystem2.PEntrez.Gene.Gene_ResultsPanel.Gene_RVDocSum#refseq
>>> ) under "Genome Reference Consortium Human Build 37 (GRCh37),
>>> Primary_Assembly" there are two different sets of coordinates. The first
>>> is called "NC_000001.10 Genome Reference Consortium Human Build 37
>>> (GRCh37), Primary_Assembly", and is the one I need, and the second one is
>>> called just "NT_004610.19" and it's the one that the script prints.
>>> This is valid for all the genes I tried.
>>> Do you know how to make the script print the "right" coordinates (at
>>> least, the one I need)?
>>> Thanks a lot in advance,
>>> Emanuele
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
