[Bioperl-l] EUtilities - pipeline - Exonic Structure

Steve Chervitz sac at bioperl.org
Wed May 16 22:29:16 UTC 2007


Another option is to use DAS ( http://biodas.org ), which was designed
precisely to solve this sort of problem.

A DAS genome query is a URL that specifies the genome assembly version
on which the returned coordinates should be based. For example, get
all features and their coordinates associated with the human actin
gene on hg17:

http://das.biopackages.net/das/genome/human/17/feature?name=ACTA1
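
As a rough sketch of how such a query URL could be assembled in a script, assuming the path layout of the example above (server base, then a "human/17" source path, then "feature?name=..."); the helper name das2_feature_url is made up for illustration, and other DAS/2 servers may organise their paths differently:

```perl
use strict;
use warnings;

# Hypothetical helper: build a DAS/2 feature query URL from its parts.
# The layout (base/source/feature?name=...) is copied from the example
# URL above and is an assumption for any other server.
sub das2_feature_url {
    my ($base, $source, $name) = @_;

    # Percent-encode every character outside the RFC 3986 unreserved set
    # (assumes an ASCII gene name or accession).
    (my $escaped = $name) =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord($1))/ge;

    return "$base/$source/feature?name=$escaped";
}

my $url = das2_feature_url(
    'http://das.biopackages.net/das/genome',   # server base from the example
    'human/17',                                # species and assembly (hg17)
    'ACTA1',                                   # gene name to look up
);
print "$url\n";
```

The resulting URL can then be fetched with any HTTP client (LWP::UserAgent, for instance) and the XML response parsed from there.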

Ensembl, UCSC, and other sites also provide DAS servers for genomic
features, but these serve a different XML response format (DAS/1.x)
from the one biopackages.net serves (DAS/2). Here are some links to
these servers, both DAS/1 and DAS/2:

http://www.biodas.org/wiki/DAS/1#Servers
http://www.biodas.org/wiki/DAS/2#Servers

By default, a DAS/2 server will return data in DAS2XML format, but you
can specify alternative formats if a server supports them. This is one
advantage of the DAS/2 retrieval spec, which is stable and is
described here:

http://biodas.org/documents/das2/das2_get.html

You may not be able to use an Entrez Gene ID directly in the query;
it depends on whether these IDs are available on the given server.
Accessions and gene names should be OK. You can always map your Entrez
IDs to accessions or gene names using this file:
ftp://ftp.ncbi.nih.gov/gene/gene2refseq.gz
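
A minimal sketch of that mapping step, under the assumption that the decompressed file is tab-separated with the GeneID in the second column and the RNA accession in the fourth, and that "-" marks a missing value (check the header line of your copy before relying on this). The function name and the sample rows are invented for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: read gene2refseq-style lines from a filehandle and
# collect, for each Entrez GeneID, the distinct RNA accessions listed for it.
# ASSUMPTION: tab-separated columns, GeneID in column 2, RNA accession in
# column 4, "-" for a missing value.
sub gene_to_accessions {
    my ($fh) = @_;
    my %seen;
    while ( my $line = <$fh> ) {
        chomp $line;
        next if $line eq '' || $line =~ /^#/;    # skip blanks and the header
        my @col = split /\t/, $line;
        next if @col < 4;                        # skip malformed rows
        my ( $gene_id, $rna_acc ) = @col[ 1, 3 ];
        next if $rna_acc eq '-';                 # no RNA accession on this row
        $seen{$gene_id}{$rna_acc} = 1;
    }

    # Turn the per-gene sets into sorted lists.
    my %acc_for;
    for my $gene_id ( keys %seen ) {
        $acc_for{$gene_id} = [ sort keys %{ $seen{$gene_id} } ];
    }
    return \%acc_for;
}

# Illustrative rows only, not copied from the real file.
my $sample = join "\n",
    "#tax_id\tGeneID\tstatus\tRNA_nucleotide_accession.version",
    "9606\t58\tREVIEWED\tNM_001100.3",
    "9606\t58\tREVIEWED\t-",
    "";
open my $fh, '<', \$sample or die "cannot open in-memory sample: $!";
my $map = gene_to_accessions($fh);
close $fh;

print "58 => @{ $map->{58} }\n";
```

For the real file, open the decompressed gene2refseq (or read the .gz through something like IO::Uncompress::Gunzip) and pass that filehandle instead of the in-memory sample.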

Steve

On 5/16/07, Benoit Ballester <benoit at ebi.ac.uk> wrote:
> Hi Tony,
>
> I don't know how simple it is in bioperl, but it is quite simple using
> the ensembl perl API.
>
> Have a look here :
>
> API installation:
> http://www.ensembl.org/info/software/api_installation.html
> API tutorial :
> http://www.ensembl.org/info/software/core/core_tutorial.html
> API Perl module Documentation :
> http://www.ensembl.org/info/software/Pdoc/ensembl/index.html
>
> so you can do something similar to the example below :
>
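> # Connect to the public Ensembl database and get a gene adaptor
> # (setup as described in the core API tutorial linked above)
> use Bio::EnsEMBL::Registry;
>
> Bio::EnsEMBL::Registry->load_registry_from_db(
>     -host => 'ensembldb.ensembl.org',
>     -user => 'anonymous',
> );
> my $gene_adaptor =
>     Bio::EnsEMBL::Registry->get_adaptor( 'Human', 'Core', 'Gene' );
>
> # Get the 'COG6' gene from human
> my $gene = $gene_adaptor->fetch_by_display_label('COG6');
>
> # Gene coordinates
> print "GENE ", $gene->stable_id(), " ",
>       $gene->start(), "-", $gene->end(), "\n";
>
> foreach my $transcript ( @{ $gene->get_all_Transcripts() } ) {
>     # Transcript coordinates
>     print "TRANSCRIPT ", $transcript->stable_id(), " ",
>           $transcript->start(), "-", $transcript->end(), "\n";
>
>     foreach my $exon ( @{ $transcript->get_all_Exons() } ) {
>         # Exon coordinates
>         print "EXON ", $exon->stable_id(), " ",
>               $exon->start(), "-", $exon->end(), "\n";
>     }
> }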
>
> Hope this helps
>
> Benoit
>
>
> Anthony Ferrari wrote:
>  > Hi all,
>  >
>  > I want to do something relatively simple and I want to know how far
>  > Bioperl tools could help me, because I'm having trouble getting there.
>  > Here is the pipeline:
>  >
>  > "EntrezGene Query" ----- (esearch) -----> "Gene ID" ------ (*) ----->
>  > "GeneStructure"
>  >
>  > (*) :
>  > From the EntrezGene ID, I want to retrieve the structure of the gene,
>  > which means having the whole genomic sequence and having the start and
>  > end positions of each exon, intron, UTR, ...
>  >
>  > I thought of 2 ways to accomplish that :
>  >
>  >   -  use 'efetch', get raw xml or asn1 and then parse it to obtain the
>  > desired positions.
>  >      this method should work but would take a little time to be ok.
>  >
>  >   -  use the Bio::DB::EntrezGene module with the "get_Seq_by_id" function.
>  > I obtain a Bio::Seq object but I am not able to find any features stored
>  > in it. So it doesn't seem that the get_Seq_by_id function gets all the
>  > information contained in an EntrezGene entry (?).
>  >
>  > Can somebody help me to make the right choice or show me the right way?
>  >
>  > I also saw that some packages intended to deal with gene structure
>  > exist, but I can't work out how to use them properly, or even how to
>  > create one of those objects!
>  > Are those packages currently usable?
>  >
>  >
>  > Thanks in advance.
>  > Best regards,
>  > tony
>  > _______________________________________________
>  > Bioperl-l mailing list
>  > Bioperl-l at lists.open-bio.org
>  > http://lists.open-bio.org/mailman/listinfo/bioperl-l