[Biojava-l] DAS client: how to retrieve features for a sequence region

Jonathan Warren jw12 at sanger.ac.uk
Thu Apr 29 08:26:40 UTC 2010


The link I gave you ( http://www.ebi.ac.uk/~rafael/dokuwiki/doku.php?id=das:courses:dasobert )
shows examples of how to connect to 'European' style DAS sources.
For UCSC and GBrowse type DAS sources you may have to play around
with the URLs to get the info you want, as they work slightly
differently from other DAS data sources and use the types to filter
data. I would suggest contacting UCSC for more info.
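
As a rough illustration of what playing around with the URLs can look like, here is a minimal sketch that only assembles request URLs. It assumes the DAS 1.x query syntax (segment=REF:start,stop, with ;type=... as an optional filter). The class and method names are made up for this example and are not part of BioJava or dasobert, and "refGene" is only an example type name; the "types" request should tell you which names a given server accepts.

```java
// Sketch only: DasUrls and its methods are hypothetical helpers,
// not part of BioJava or dasobert.
class DasUrls {

    /** Appends a DAS command name to the data source URL, adding a slash if needed. */
    private static StringBuilder command(String sourceUrl, String name) {
        StringBuilder sb = new StringBuilder(sourceUrl);
        if (!sourceUrl.endsWith("/")) {
            sb.append('/');
        }
        return sb.append(name);
    }

    /** URL of the "types" request, which lists the feature types a source offers. */
    public static String typesUrl(String sourceUrl) {
        return command(sourceUrl, "types").toString();
    }

    /**
     * URL of a "features" request restricted to one region and,
     * optionally, to one or more feature types.
     */
    public static String featuresUrl(String sourceUrl, String ref,
                                     int start, int stop, String... types) {
        StringBuilder sb = command(sourceUrl, "features");
        sb.append("?segment=").append(ref).append(':').append(start).append(',').append(stop);
        for (String type : types) {
            sb.append(";type=").append(type); // assumed ';' separator between parameters
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String source = "http://genome.ucsc.edu/cgi-bin/das/hg17";
        System.out.println(typesUrl(source));
        System.out.println(featuresUrl(source, "22", 15000000, 16000000));
        System.out.println(featuresUrl(source, "22", 15000000, 16000000, "refGene"));
    }
}
```

Asking for one segment and one type at a time keeps the response small, which avoids downloading every feature for a whole reference sequence.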

The dasobert library is what you should use. The DASSequenceDB.java
code that you are currently looking at in BioJava is old and not really
supported anymore.

> I was hoping to be able to use most of the functionality especially  
> for the parsing of the XML and creating the URLs by means of  
> functions/methods that are already around…
This is what the dasobert library is for ;)
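
To give an idea of what such a data structure could look like, here is a minimal sketch that reads a DAS features response into a list of simple objects using only the XML parser that ships with the JDK. It is not how dasobert does it. The DasFeature class, its fields and the embedded sample document are all assumptions made for illustration; the element names (SEGMENT, FEATURE, TYPE, START, END) follow the DASGFF layout of the DAS 1.x specification.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Sketch only: a hypothetical parser, not part of BioJava or dasobert.
class DasFeatureParser {

    /** Made-up sample response; a real server returns more fields per feature. */
    static final String SAMPLE =
        "<?xml version='1.0'?>"
        + "<DASGFF><GFF version='1.0'>"
        + "<SEGMENT id='22' start='15000000' stop='16000000'>"
        + "<FEATURE id='f1' label='f1'><TYPE id='refGene'>refGene</TYPE>"
        + "<START>15000100</START><END>15000500</END></FEATURE>"
        + "<FEATURE id='f2' label='f2'><TYPE id='refGene'>refGene</TYPE>"
        + "<START>15200000</START><END>15200900</END></FEATURE>"
        + "</SEGMENT></GFF></DASGFF>";

    /** Hypothetical value class holding a few fields of one FEATURE element. */
    static final class DasFeature {
        final String segmentId;
        final String id;
        final String type;
        final int start;
        final int end;

        DasFeature(String segmentId, String id, String type, int start, int end) {
            this.segmentId = segmentId;
            this.id = id;
            this.type = type;
            this.start = start;
            this.end = end;
        }
    }

    /** First descendant element with the given tag name, or null if there is none. */
    private static Element first(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? null : (Element) nodes.item(0);
    }

    /** Parses a DASGFF document into a flat list of features. */
    public static List<DasFeature> parse(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            List<DasFeature> features = new ArrayList<>();
            NodeList segments = doc.getElementsByTagName("SEGMENT");
            for (int s = 0; s < segments.getLength(); s++) {
                Element segment = (Element) segments.item(s);
                NodeList nodes = segment.getElementsByTagName("FEATURE");
                for (int f = 0; f < nodes.getLength(); f++) {
                    Element feature = (Element) nodes.item(f);
                    Element type = first(feature, "TYPE");
                    features.add(new DasFeature(
                        segment.getAttribute("id"),
                        feature.getAttribute("id"),
                        type == null ? "" : type.getAttribute("id"),
                        Integer.parseInt(first(feature, "START").getTextContent().trim()),
                        Integer.parseInt(first(feature, "END").getTextContent().trim())));
                }
            }
            return features;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse DAS response", e);
        }
    }

    public static void main(String[] args) {
        for (DasFeature f : parse(SAMPLE)) {
            System.out.println(f.segmentId + ":" + f.start + "-" + f.end + " " + f.type + " " + f.id);
        }
    }
}
```

Real responses may include a DOCTYPE declaration, which a DOM parser can try to resolve; the sample above leaves it out.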


On 29 Apr 2010, at 07:30, Bernd Jagla wrote:

> Hi Jonathan,
>
> Just to clarify, I need to write my own das client? I was hoping to  
> be able to use most of the functionality especially for the parsing  
> of the XML and creating the URLs by means of functions/methods that  
> are already around…
> I am now going into debug mode for the DAS package in biojava to
> look for the XML parsing. If you have any further pointers on specific
> methods I should be looking at, it would mean a lot to me…
> In short, I think I can create the URLs from scratch with not much
> effort. I don’t currently know how to put the XML into a data
> structure or what this data structure should look like.
>
> Thanks for your kind help,
>
> Bernd
>
> From: Jonathan Warren [mailto:jw12 at sanger.ac.uk]
> Sent: Wednesday, April 28, 2010 10:21 PM
> To: Bernd Jagla
> Cc: biojava-l at lists.open-bio.org
> Subject: Re: [Biojava-l] DAS client: how to retrieve features for a  
> sequence region
>
> Hi Bernd
>
> For the UCSC you need to filter on types. See http://genome.ucsc.edu/FAQ/FAQdownloads.html#downloads
> where there is a section called "Downloading data from the UCSC DAS server".
>
> For DAS libraries you can see a tutorial here: http://www.biodas.org/wiki/DASWorkshop2010#Day_2
>
> The one you would be most interested in is the Dasobert tutorial
> ( http://www.ebi.ac.uk/~rafael/dokuwiki/doku.php?id=das:courses:dasobert )
> for DAS client creation, but there is also a good JavaScript
> library called JSDas.
>
> If you need any more info, don't hesitate to ask.
>
> Jonathan.
>
>
> On 28 Apr 2010, at 08:25, Bernd Jagla wrote:
>
>
> Hi there,
>
> I am trying to retrieve information (features) from the UCSC genome
> browser using the DAS interface.
> I am looking at the org.biojava.bio.program.das sources. I can
> retrieve all top level entry points with
> DASSequenceDB(dbURL)
> (Apparently the last entry from the returned XML object gives a
> [Fatal Error] :1:1: Content is not allowed in prolog.
> which I am ignoring...)
>
> and also the DSN entries using:
> DAS das = new DAS();
> das.addDasURL(new URL(dbURLString));
> for (Iterator i = das.getReferenceServers().iterator(); i.hasNext(); )
> {....
>
> When I try to access features for a top level entry point, i.e. a
> reference sequence, I have the impression that first all features for
> a given reference sequence are being downloaded.
>
> My questions:
>
> How can I access only the features of a specific region? I guess in
> DAS terms I want to specify the segment part of the URL
> (http://genome.ucsc.edu/cgi-bin/das/hg17/features?segment=22:15000000,16000000).
>
> I would also like to get the list of available features. How can I
> achieve this? From Wireshark output I can see that this is being
> retrieved somehow behind the scenes. How can I access this information?
>
> I am looking at TestDAS*.java; are there any other examples around
> that I can use to learn from?
>
> Thanks a lot for your kind support,
>
> Best,
>
> Bernd
>
>
> _______________________________________________
> Biojava-l mailing list  -  Biojava-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-l
>
> Jonathan Warren
> Senior Developer and DAS coordinator
> jw12 at sanger.ac.uk
> Ext: 2314
> Telephone: 01223 492314
> -- The Wellcome Trust Sanger Institute is operated by Genome  
> Research Limited, a charity registered in England with number  
> 1021457 and a company registered in England with number 2742969,  
> whose registered office is 215 Euston Road, London, NW1 2BE.
