[DAS] retrieve genes by name

Andreas Kahari ak at ebi.ac.uk
Mon Jul 12 11:19:03 EDT 2004


On Fri, Jul 09, 2004 at 11:39:06PM +0200, Maximilian Haeussler wrote:
> Hi,
> 
> I'm a complete newbie to DAS and couldn't find documentation on this issue, 
> so I hope you can help me:
> 
> 1) In June 03 there was a discussion on this list started by Ethan Cerami 
> (http://portal.open-bio.org/pipermail/das/2003-January/000647.html) about 
> finding a gene by its (HUGO?) name and retrieving the sequence. I didn't 
> completely understand it, but from what I've understood, retrieving a CDS 
> was not that straightforward. Has it gotten any easier in the meantime?

No, this is not straightforward.  The 'ens1834cds' source at
das.ensembl.org serves CDS coordinates for Ensembl peptides, with
contigs as entry points.

So,
http://das.ensembl.org/das/ens1834cds/features?segment=AC105091
will give you things like

      <FEATURE id="ENSP00000317137-2" label="ENSP00000317137">
        <TYPE id="translation">translation</TYPE>
        <METHOD id="ensembl">ensembl</METHOD>
        <START>55008</START>
        <END>55087</END>

        <SCORE>-</SCORE>
        <ORIENTATION>+</ORIENTATION>
        <PHASE>-</PHASE>
        <GROUP id="translation-ENSP00000317137" type="translation" label="ENSP00000317137">
          <LINK href="http://www.ensembl.org/Homo_sapiens/protview?peptide=ENSP00000317137">ProtView</LINK>
        </GROUP>
      </FEATURE>
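
A response like the one above can be reduced to plain coordinates with a
few lines of code.  The following is a minimal sketch in Python, not part
of any DAS client library.  The function names parse_features and
fetch_features are made up for illustration, the DASGFF/GFF/SEGMENT
wrapper around the FEATURE element is assumed from the DAS 1.x response
layout, and the base URL passed to fetch_features is whatever server you
use (the one quoted above may no longer respond).

```python
import xml.etree.ElementTree as ET
from urllib.request import urlopen

# Sample response: the FEATURE element is the one quoted above; the
# DASGFF/GFF/SEGMENT wrapper is an assumption about the full document.
SAMPLE = """<DASGFF>
  <GFF>
    <SEGMENT id="AC105091">
      <FEATURE id="ENSP00000317137-2" label="ENSP00000317137">
        <TYPE id="translation">translation</TYPE>
        <METHOD id="ensembl">ensembl</METHOD>
        <START>55008</START>
        <END>55087</END>
        <SCORE>-</SCORE>
        <ORIENTATION>+</ORIENTATION>
        <PHASE>-</PHASE>
        <GROUP id="translation-ENSP00000317137" type="translation" label="ENSP00000317137">
          <LINK href="http://www.ensembl.org/Homo_sapiens/protview?peptide=ENSP00000317137">ProtView</LINK>
        </GROUP>
      </FEATURE>
    </SEGMENT>
  </GFF>
</DASGFF>"""


def parse_features(xml_text):
    """Return one dict per FEATURE element found anywhere in the document."""
    root = ET.fromstring(xml_text)
    features = []
    for feat in root.iter("FEATURE"):
        features.append({
            "id": feat.get("id"),
            "label": feat.get("label"),
            "type": feat.findtext("TYPE"),
            "start": int(feat.findtext("START")),
            "end": int(feat.findtext("END")),
            "orientation": feat.findtext("ORIENTATION"),
        })
    return features


def fetch_features(base_url, source, segment):
    """Request <base_url>/<source>/features?segment=<segment> and parse it."""
    url = f"{base_url}/{source}/features?segment={segment}"
    with urlopen(url) as response:
        return parse_features(response.read().decode("utf-8"))


# Offline usage: parse the embedded sample rather than a live server.
for feature in parse_features(SAMPLE):
    print(feature["label"], feature["start"], feature["end"], feature["orientation"])
```

Against a live server the call would look like
fetch_features("http://das.ensembl.org/das", "ens1834cds", "AC105091"),
which rebuilds the URL shown earlier.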


As far as I'm aware, and the Sanger people would be the ones to
know with certainty, we currently have no DAS server serving
CDS *sequence* directly (even though the servers seem to report
"dna/1.0" in the X-DAS-Capabilities HTTP header).

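To see what a server claims to support, the X-DAS-Capabilities header
value can be split into capability names and versions.  This is a rough
sketch only: the helper name parse_capabilities and the example header
value are hypothetical, and the semicolon-separated "name/version" layout
is assumed from the DAS 1.x specification.  As noted above, a server
advertising "dna/1.0" does not mean it will return CDS sequence.

```python
def parse_capabilities(header_value):
    """Map capability name to version for a 'name/version; ...' string."""
    capabilities = {}
    for item in header_value.split(";"):
        item = item.strip()
        if not item:
            continue  # tolerate empty input or a trailing separator
        name, _, version = item.partition("/")
        capabilities[name.strip()] = version.strip() or None
    return capabilities


# Hypothetical header value, for illustration only.
example = "features/1.0; dna/1.0; types/1.0"
caps = parse_capabilities(example)
print("dna" in caps, caps.get("dna"))
```
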
> 2) I am trying to retrieve genes by LocusLink/HUGO or any other IDs from 
> BioJava and get their 5' sequence. Could you point me to some documentation 
> that describes this task? Of course, the best would be some "BioJava in 
> Anger"-style cookbook-like recipe on the internet, but any kind of keyword 
> is appreciated. Yes, there is the DAS client in BioJava, but it does not 
> seem to support gene names. Or am I off track here, and DAS is simply not 
> meant to support searches like this directly?

First of all, you need a DAS server that understands the IDs
you're trying to use.  I'm a bit unsure whether DAS is the
right tool here, though.  Try something like EnsMart instead
(http://www.ensembl.org/Multi/martview).

For bulk queries, or more complicated stuff, you might want to
look into using the BioMart or Ensembl APIs.  DAS may, I think,
be a bit too simple for that.  BioMart is discussed on the mart-dev
list (http://www.ebi.ac.uk/biomart/contact.html), and Ensembl on
the ensembl-dev list (http://www.ensembl.org/Docs/).
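
Whichever tool ends up supplying the gene coordinates, working out the
5' region is simple arithmetic on start, end and strand.  The sketch
below assumes 1-based inclusive coordinates, as in the feature shown
earlier.  The function names upstream_region and reverse_complement are
hypothetical, and the numbers in the usage line reuse that feature's
coordinates purely as an illustration, not as a real gene model.

```python
def upstream_region(start, end, strand, length):
    """Return (start, end) of the region 5' of a feature, or None.

    Coordinates are 1-based and inclusive. On the '+' strand the region
    lies before `start`; on the '-' strand it lies after `end`. The '-'
    result is not clipped, since the sequence length is unknown here.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    if strand == "+":
        if start <= 1:
            return None  # nothing upstream of the first base
        return (max(1, start - length), start - 1)
    if strand == "-":
        return (end + 1, end + length)
    raise ValueError("strand must be '+' or '-'")


_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse-complement a DNA string; needed for '-' strand regions."""
    return seq.translate(_COMPLEMENT)[::-1]


# Illustrative numbers only: 1000 bases 5' of a '+' strand feature.
print(upstream_region(55008, 55087, "+", 1000))
```

For a feature on the '-' strand, the sequence fetched for the returned
interval would still need to be passed through reverse_complement to
read 5' to 3'.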


Regards,
Andreas

-- 
|[][]| Andreas Kähäri      EMBL, European Bioinformatics Institute
| [] |                     Wellcome Trust Genome Campus
|[][]| Ensembl Developer   Hinxton, Cambridgeshire, CB10 1SD
| [] | DAS Team Leader	   United Kingdom

