[Biojava-l] DAS client: how to retrieve features for a sequence region

Thu Apr 29 09:57:58 UTC 2010

Great, that is very helpful. One more question: should I be using the Das1 or
Das2 implementation? The demo I am looking at uses Das2 (I think), but I am
running into problems. By modifying things in the Das2SourceHandler I can
now get IDs (instead of using the uri). Is this the right way of approaching
this, or should I be looking somewhere else?

When you say I have to play around with the URLs, can you give me an example?
Is the problem described above part of this? (This is not the URL but rather
the XML.)

Sorry for these questions, but I find it extremely difficult to get my head
around all these different versions (DAS1/2; dasobert/programs.das;
European/Rest; etc.).

Thanks a lot,

Bernd

PS. I guess I should have attended the recent meeting. ;(

  _____  

From: Jonathan Warren [mailto:jw12 at sanger.ac.uk] 
Sent: Thursday, April 29, 2010 10:27 AM
To: Bernd Jagla
Cc: biojava-l at lists.open-bio.org
Subject: Re: [Biojava-l] DAS client: how to retrieve features for a sequence
region

The link I gave you
http://www.ebi.ac.uk/~rafael/dokuwiki/doku.php?id=das:courses:dasobert shows
examples of how to connect to 'European' style DAS sources. For the UCSC and
GBrowse type DAS sources you may have to play around with the URLs to get
the info you want, as they work slightly differently from other DAS data
sources and use the types to filter data. I would suggest contacting the
UCSC for more info.

The dasobert library is what you should use; the DASSequenceDB.java that you
are currently looking at in biojava is old and not really supported
anymore.

I was hoping to be able to use most of the functionality especially for the
parsing of the XML and creating the URLs by means of functions/methods that
are already around.

This is what the dasobert library is for ;)

On 29 Apr 2010, at 07:30, Bernd Jagla wrote:

Hi Jonathan,

Just to clarify, do I need to write my own DAS client? I was hoping to be able
to use most of the functionality, especially for the parsing of the XML and
creating the URLs, by means of functions/methods that are already around.

I am now going into debug mode for the DAS package in biojava to look for
the XML parsing. If you have any further pointers on specific methods I should
be looking at, it would mean a lot to me.

In short, I think I can create the URLs from scratch with not much effort. I
don't currently know how to put the XML into a data structure or what this
data structure should look like.
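
One possible shape for such a data structure, sketched with nothing but the
JDK's DOM parser: each FEATURE element of a DAS/1 features response (a DASGFF
document) becomes one small value object. The class names DasFeature and
DasGffParser are hypothetical and not part of BioJava or dasobert, and the
sample XML is invented for illustration. A real response may also carry a
DOCTYPE line, which the parser may try to resolve.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

/** Hypothetical value class for one FEATURE element (not a BioJava or dasobert type). */
final class DasFeature {
    final String id;
    final String typeId;
    final int start;
    final int end;

    DasFeature(String id, String typeId, int start, int end) {
        this.id = id;
        this.typeId = typeId;
        this.start = start;
        this.end = end;
    }
}

final class DasGffParser {
    // Invented response, shaped like a DAS/1 DASGFF document.
    static final String SAMPLE =
        "<?xml version=\"1.0\"?>"
        + "<DASGFF><GFF version=\"1.0\" href=\"http://example.org/das/demo/features\">"
        + "<SEGMENT id=\"22\" start=\"15000000\" stop=\"16000000\">"
        + "<FEATURE id=\"f1\" label=\"f1\"><TYPE id=\"refGene\">refGene</TYPE>"
        + "<START>15000100</START><END>15000900</END></FEATURE>"
        + "<FEATURE id=\"f2\" label=\"f2\"><TYPE id=\"refGene\">refGene</TYPE>"
        + "<START>15500000</START><END>15501234</END></FEATURE>"
        + "</SEGMENT></GFF></DASGFF>";

    /** Reads every FEATURE element from the stream into a list of DasFeature objects. */
    static List<DasFeature> parse(InputStream in) {
        try {
            NodeList nodes = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().parse(in).getElementsByTagName("FEATURE");
            List<DasFeature> features = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element feature = (Element) nodes.item(i);
                Element type = (Element) feature.getElementsByTagName("TYPE").item(0);
                features.add(new DasFeature(
                        feature.getAttribute("id"),
                        type.getAttribute("id"),
                        Integer.parseInt(text(feature, "START")),
                        Integer.parseInt(text(feature, "END"))));
            }
            return features;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("could not parse DASGFF document", e);
        }
    }

    // Trimmed text of the first child element with the given tag name.
    private static String text(Element parent, String tag) {
        return parent.getElementsByTagName(tag).item(0).getTextContent().trim();
    }

    public static void main(String[] args) {
        InputStream in = new ByteArrayInputStream(SAMPLE.getBytes(StandardCharsets.UTF_8));
        for (DasFeature f : parse(in)) {
            System.out.println(f.id + " " + f.typeId + " " + f.start + ".." + f.end);
        }
    }
}
```

The sketch keeps only id, type, start and end; the other FEATURE children
(METHOD, SCORE, ORIENTATION and so on) could be added to the value class in
the same way.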

Thanks for your kind help,

Bernd

  _____  

From: Jonathan Warren [mailto:jw12 at sanger.ac.uk] 
Sent: Wednesday, April 28, 2010 10:21 PM
To: Bernd Jagla
Cc: biojava-l at lists.open-bio.org
Subject: Re: [Biojava-l] DAS client: how to retrieve features for a sequence
region

Hi Bernd

For the UCSC you need to filter on types. See
http://genome.ucsc.edu/FAQ/FAQdownloads.html#downloads where there is a
section called "Downloading data from the UCSC DAS server".
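
As a rough illustration of filtering on types, the sketch below builds a
DAS/1 features request that carries one or more type arguments, separated by
';' as in DAS/1 query strings. The helper name featuresUrl and the class name
are hypothetical, and refGene is only an example type; which types a given
data source offers has to be checked against the server.

```java
import java.util.Arrays;
import java.util.List;

final class DasTypeFilter {
    /**
     * Builds a DAS/1 "features" request for one segment, restricted to the
     * given feature types. An empty list yields an unfiltered request.
     */
    static String featuresUrl(String dataSourceUrl, String segment, List<String> types) {
        StringBuilder url = new StringBuilder(dataSourceUrl);
        if (!dataSourceUrl.endsWith("/")) {
            url.append('/');
        }
        url.append("features?segment=").append(segment);
        for (String type : types) {
            url.append(";type=").append(type);
        }
        return url.toString();
    }

    public static void main(String[] args) {
        // hg17 and the segment are taken from the thread; refGene is an assumed type name.
        System.out.println(featuresUrl("http://genome.ucsc.edu/cgi-bin/das/hg17",
                "22:15000000,16000000", Arrays.asList("refGene")));
    }
}
```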

For DAS libraries you can see a tutorial here:
http://www.biodas.org/wiki/DASWorkshop2010#Day_2

The one you would be most interested in is the Dasobert tutorial
(http://www.ebi.ac.uk/~rafael/dokuwiki/doku.php?id=das:courses:dasobert) for
DAS client creation, but there is also a good JavaScript library called
JSDas.

If you need any more info, don't hesitate to ask.

Jonathan.

On 28 Apr 2010, at 08:25, Bernd Jagla wrote:

Hi there,         

I am trying to retrieve information (features) from the UCSC genome browser
using the DAS interface. 
I am looking at the org.biojava.bio.program.das sources. I can retrieve all
top level entry points with 
DASSequenceDB(dbURL)
(Apparently the last entry of the returned XML object gives a
[Fatal Error] :1:1: Content is not allowed in prolog.
which I am ignoring...)

and also the DSN entries using:
DAS das = new DAS();
das.addDasURL(new URL(dbURLString));
for (Iterator i = das.getReferenceServers().iterator(); i.hasNext(); ) {
    ....

When I try to access features for a top level entry point, i.e. a reference
sequence, I have the impression that all features for the given reference
sequence are downloaded first.

My questions: 

How can I access only the features of a specific region? I guess in DAS
terms I want to specify the segment part of the URL
(http://genome.ucsc.edu/cgi-bin/das/hg17/features?segment=22:15000000,16000000).
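
For reference, a request of this shape can be assembled by hand. The sketch
below is plain JDK code, not the BioJava or dasobert API; the class and
method names are hypothetical. It builds the segment URL quoted above and
includes a helper that would open the response stream.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;

final class DasSegmentRequest {
    /** Builds a DAS/1 "features" URL of the form .../features?segment=REF:start,stop. */
    static String featuresUrl(String dataSourceUrl, String reference, long start, long stop) {
        if (start < 1 || stop < start) {
            throw new IllegalArgumentException("need 1 <= start <= stop");
        }
        String base = dataSourceUrl.endsWith("/") ? dataSourceUrl : dataSourceUrl + "/";
        return base + "features?segment=" + reference + ":" + start + "," + stop;
    }

    /** Opens the response stream; the caller is responsible for closing it. */
    static InputStream open(String url) throws IOException {
        return URI.create(url).toURL().openStream();
    }

    public static void main(String[] args) {
        // Only prints the URL; call open(...) to actually contact the server.
        System.out.println(featuresUrl("http://genome.ucsc.edu/cgi-bin/das/hg17",
                "22", 15000000L, 16000000L));
    }
}
```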

I would also like to get the list of available features. How can I achieve
this? From Wireshark output I can see that this is being retrieved somehow
behind the scenes. How can I access this information?
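
In DAS/1 the list of available feature types is served by the "types"
command of a data source. As a sketch under that assumption, the code below
builds the types URL and pulls the type ids out of a DASTYPES response with
the JDK's DOM parser. The class name, method names and sample XML are
invented for illustration and are not BioJava or dasobert code.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

final class DasTypes {
    // Invented response, shaped like a DAS/1 DASTYPES document.
    static final String SAMPLE =
        "<?xml version=\"1.0\"?>"
        + "<DASTYPES><GFF version=\"1.0\" href=\"http://example.org/das/demo/types\">"
        + "<SEGMENT>"
        + "<TYPE id=\"refGene\">12</TYPE>"
        + "<TYPE id=\"knownGene\">34</TYPE>"
        + "</SEGMENT></GFF></DASTYPES>";

    /** Builds the URL of the "types" command for a data source. */
    static String typesUrl(String dataSourceUrl) {
        return (dataSourceUrl.endsWith("/") ? dataSourceUrl : dataSourceUrl + "/") + "types";
    }

    /** Returns the id attribute of every TYPE element in the document. */
    static List<String> typeIds(String xml) {
        try {
            NodeList nodes = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getElementsByTagName("TYPE");
            List<String> ids = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                ids.add(((Element) nodes.item(i)).getAttribute("id"));
            }
            return ids;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("could not parse DASTYPES document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(typesUrl("http://genome.ucsc.edu/cgi-bin/das/hg17"));
        System.out.println(typeIds(SAMPLE));
    }
}
```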

I am looking at TestDAS*.java; are there any other examples around that I
can use to learn from?

Thanks a lot for your kind support,

Best,

Bernd

_______________________________________________
Biojava-l mailing list  -  Biojava-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/biojava-l

Jonathan Warren

Senior Developer and DAS coordinator

jw12 at sanger.ac.uk

Ext: 2314

Telephone: 01223 492314

-- The Wellcome Trust Sanger Institute is operated by Genome Research
Limited, a charity registered in England with number 1021457 and a company
registered in England with number 2742969, whose registered office is 215
Euston Road, London, NW1 2BE.
