[Biojava-l] DAS client: how to retrieve features for a sequence region

Thu Apr 29 06:30:03 UTC 2010

Hi Jonathan,

Just to clarify, I need to write my own das client? I was hoping to be able
to use most of the functionality especially for the parsing of the XML and
creating the URLs by means of functions/methods that are already around. 

I am now going into debug mode for the DAS package in biojava to look for
the XML parsing. If you have any further pointers on specific methods I
should be looking at, it would mean a lot to me.

In short, I think I can create the URLs from scratch without much effort. I
don't currently know how to put the XML into a data structure, or what this
data structure should look like.
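
For illustration only, here is a minimal sketch of one possible data structure
and parser for a DAS 1.x features response (the DASGFF document, with
SEGMENT, FEATURE, TYPE, START and END elements). It uses only the JDK's DOM
parser, not the biojava DAS classes. The class names DasGffSketch and Feature
and the SAMPLE document are made up for this example, and real responses carry
more fields (METHOD, SCORE, ORIENTATION, GROUP and so on) that are ignored here.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

final class DasGffSketch {

    /** One FEATURE element, flattened (hypothetical holder class). */
    static final class Feature {
        final String segmentId, id, label, type;
        final int start, end;

        Feature(String segmentId, String id, String label, String type, int start, int end) {
            this.segmentId = segmentId;
            this.id = id;
            this.label = label;
            this.type = type;
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return segmentId + ":" + start + "," + end + " " + type + " " + id;
        }
    }

    /** Hand-written sample response, not real UCSC output. */
    static final String SAMPLE =
        "<DASGFF><GFF version=\"1.0\">"
        + "<SEGMENT id=\"22\" start=\"15000000\" stop=\"16000000\">"
        + "<FEATURE id=\"f1\" label=\"geneA\"><TYPE id=\"refGene\">refGene</TYPE>"
        + "<START>15000100</START><END>15000900</END></FEATURE>"
        + "</SEGMENT></GFF></DASGFF>";

    /** Parses a DASGFF document into a flat list of features. */
    static List<Feature> parse(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Responses may reference an external DTD; resolve it to an empty stream
            // so that parsing does not depend on fetching it.
            builder.setEntityResolver((publicId, systemId) -> new InputSource(new StringReader("")));
            Document doc = builder.parse(new InputSource(new StringReader(xml)));

            List<Feature> features = new ArrayList<>();
            NodeList segments = doc.getElementsByTagName("SEGMENT");
            for (int s = 0; s < segments.getLength(); s++) {
                Element segment = (Element) segments.item(s);
                NodeList nodes = segment.getElementsByTagName("FEATURE");
                for (int f = 0; f < nodes.getLength(); f++) {
                    Element feature = (Element) nodes.item(f);
                    Element type = (Element) feature.getElementsByTagName("TYPE").item(0);
                    features.add(new Feature(
                        segment.getAttribute("id"),
                        feature.getAttribute("id"),
                        feature.getAttribute("label"),
                        type == null ? "" : type.getAttribute("id"),
                        Integer.parseInt(text(feature, "START")),
                        Integer.parseInt(text(feature, "END"))));
                }
            }
            return features;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse DASGFF document", e);
        }
    }

    /** Trimmed text of the first descendant element with the given tag, or "" if absent. */
    private static String text(Element parent, String tag) {
        NodeList list = parent.getElementsByTagName(tag);
        return list.getLength() == 0 ? "" : list.item(0).getTextContent().trim();
    }

    public static void main(String[] args) {
        for (Feature feature : parse(SAMPLE)) {
            System.out.println(feature);
        }
    }
}
```

A flat list keeps the sketch short; grouping features per segment in a map
would be a natural alternative when several segments are requested at once.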

Thanks for your kind help,

Bernd

  _____  

From: Jonathan Warren [mailto:jw12 at sanger.ac.uk] 
Sent: Wednesday, April 28, 2010 10:21 PM
To: Bernd Jagla
Cc: biojava-l at lists.open-bio.org
Subject: Re: [Biojava-l] DAS client: how to retrieve features for a sequence
region

Hi Bernd

For UCSC you need to filter on types. See
http://genome.ucsc.edu/FAQ/FAQdownloads.html#downloads; there is a section
called "Downloading data from the UCSC DAS server".

For DAS libraries you can see a tutorial here:
http://www.biodas.org/wiki/DASWorkshop2010#Day_2

The one you would be most interested in is the Dasobert tutorial
(http://www.ebi.ac.uk/~rafael/dokuwiki/doku.php?id=das:courses:dasobert) for
DAS client creation, but there is also a good JavaScript library called
JSDas.

If you need any more info, don't hesitate to ask.

Jonathan.

On 28 Apr 2010, at 08:25, Bernd Jagla wrote:

Hi there,         

I am trying to retrieve information (features) from the UCSC genome browser
using the DAS interface. 
I am looking at the org.biojava.bio.program.das sources. I can retrieve all
top level entry points with 
DASSequenceDB(dbURL)
(Apparently the last entry of the returned XML object gives a
[Fatal Error] :1:1: Content is not allowed in prolog.
Which I am ignoring...)

and also the DSN entries using:
DAS das = new DAS();
das.addDasURL(new URL(dbURLString));
for (Iterator i = das.getReferenceServers().iterator(); i.hasNext(); ) {....

When I try to access features for a top level entry point, i.e. a reference
sequence, I have the impression that all features for that reference
sequence are downloaded first.

My questions: 

How can I access only the features of a specific region? I guess in DAS
terms I want to specify the segment part of the URL
(http://genome.ucsc.edu/cgi-bin/das/hg17/features?segment=22:15000000,16000000).
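
As a rough sketch of how such a request could be assembled by hand, the
following builds a DAS 1.x features URL for one segment, with optional type
filters appended as ";type=..." arguments (the form shown in the UCSC FAQ
section mentioned in the reply above). The class name DasFeatureUrl is
hypothetical and not part of biojava, and "refGene" is only an example type
name that is assumed, not verified, to exist for hg17.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class DasFeatureUrl {

    /**
     * Builds a "features" request for one region of a reference sequence.
     * dataSourceUrl is the data source root, e.g. ".../cgi-bin/das/hg17".
     */
    static String build(String dataSourceUrl, String ref, int start, int stop, String... types) {
        StringBuilder url = new StringBuilder(dataSourceUrl);
        if (!dataSourceUrl.endsWith("/")) {
            url.append('/');
        }
        url.append("features?segment=")
           .append(encode(ref)).append(':').append(start).append(',').append(stop);
        // Each type filter is a separate argument, separated by semicolons.
        for (String type : types) {
            url.append(";type=").append(encode(type));
        }
        return url.toString();
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        String source = "http://genome.ucsc.edu/cgi-bin/das/hg17";
        // Region only, matching the URL quoted in the question.
        System.out.println(build(source, "22", 15000000, 16000000));
        // Same region restricted to one (assumed) feature type.
        System.out.println(build(source, "22", 15000000, 16000000, "refGene"));
    }
}
```

The resulting string can be passed to java.net.URL and read like any other
HTTP resource; the response is the XML document that then needs parsing.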

I would also like to get the list of available features. How can I achieve
this? From Wireshark output I can see that this is being retrieved somehow
behind the scenes. How can I access this information?

I am looking at TestDAS*.java; are there any other examples around that I
can use to learn from?

Thanks a lot for your kind support,

Best,

Bernd

_______________________________________________
Biojava-l mailing list  -  Biojava-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/biojava-l

Jonathan Warren

Senior Developer and DAS coordinator

jw12 at sanger.ac.uk

Ext: 2314

Telephone: 01223 492314
