[DAS] How do you map novel sequences and retrieve local feature info?

Oliver Lyttelton o_lyttelton@hotmail.com
Fri, 31 May 2002 14:39:48 +0000


Dear all,
    I want first to thank Thomas Down for his swift and illuminating answer 
this morning. Unfortunately, while it has helped me delve deeper, it has not 
yet taken me out of the woods.

The basic problem is that I want to automatically map short sequences 
(100-600 bp), which a colleague is retrieving via wet-lab experiments, onto 
the human genome. I then want to retrieve a list of all interesting features 
located near each mapped sequence.

NCBI offers an automated HTTP protocol for retrieving the results of BLAST 
queries in XML format, but the problem is that their human genome database 
returns hits relative to NT_XXXXXX segment numbers, and these do not always 
match up to the Ensembl draft sequence. I need to BLAST against the Ensembl 
sequence, so that I can retrieve local features from the Ensembl DAS 
service.
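
For illustration, here is a minimal Python sketch of pulling hit coordinates 
out of BLAST XML output. The element names (Hit_accession, Hsp_hit-from, 
Hsp_hit-to, Hsp_evalue) follow the NCBI BLAST XML format; the sample 
document, its accession and coordinates, and the function name best_hits are 
made up for the example. The coordinates it yields are still relative to the 
NT_ segment, so this does not by itself solve the mapping problem.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written fragment in the shape of NCBI BLAST XML output.
# The accession and coordinates are invented for illustration only.
SAMPLE_BLAST_XML = """<BlastOutput>
  <BlastOutput_iterations>
    <Iteration>
      <Iteration_hits>
        <Hit>
          <Hit_id>gi|0000000|ref|NT_000000.1|</Hit_id>
          <Hit_accession>NT_000000</Hit_accession>
          <Hit_hsps>
            <Hsp>
              <Hsp_evalue>1e-50</Hsp_evalue>
              <Hsp_hit-from>12000</Hsp_hit-from>
              <Hsp_hit-to>12350</Hsp_hit-to>
            </Hsp>
          </Hit_hsps>
        </Hit>
      </Iteration_hits>
    </Iteration>
  </BlastOutput_iterations>
</BlastOutput>"""


def best_hits(blast_xml):
    """Return (accession, start, end, evalue) per HSP, lowest e-value first."""
    root = ET.fromstring(blast_xml)
    hits = []
    for hit in root.iter("Hit"):
        accession = hit.findtext("Hit_accession")
        for hsp in hit.iter("Hsp"):
            start = int(hsp.findtext("Hsp_hit-from"))
            end = int(hsp.findtext("Hsp_hit-to"))
            evalue = float(hsp.findtext("Hsp_evalue"))
            # The two coordinates may arrive in descending order for
            # reverse-strand hits, so store them as (low, high).
            hits.append((accession, min(start, end), max(start, end), evalue))
    return sorted(hits, key=lambda h: h[3])


if __name__ == "__main__":
    for accession, start, end, evalue in best_hits(SAMPLE_BLAST_XML):
        print(accession, start, end, evalue)
```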

The obvious answer is to use the Ensembl human genome BLAST page, which 
returns hits relative to contigs on the golden path, conveniently compatible 
with the DAS features request from the Ensembl DAS website.
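
A rough sketch of the DAS side, assuming a DAS/1 server: the features 
command takes a segment=ID:start,stop argument and returns a DASGFF document 
with one FEATURE element per annotation. The server URL, data source name, 
contig ID, sample response and function names below are placeholders chosen 
for the example, not real Ensembl values.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote

# Hypothetical DASGFF response, hand-written in the shape of a DAS/1
# "features" reply. IDs, types and coordinates are invented.
SAMPLE_DASGFF = """<DASGFF>
  <GFF version="1.0" href="http://das.example.org/das/human_assembly/features">
    <SEGMENT id="CONTIG_1" start="500" stop="1850">
      <FEATURE id="feat1" label="feat1">
        <TYPE id="exon">exon</TYPE>
        <METHOD id="prediction">prediction</METHOD>
        <START>700</START>
        <END>950</END>
      </FEATURE>
    </SEGMENT>
  </GFF>
</DASGFF>"""


def features_url(server, dsn, segment_id, start, stop, flank=0):
    """Build a DAS/1 features request for a hit plus `flank` bp either side."""
    lo = max(1, start - flank)  # DAS coordinates are 1-based
    hi = stop + flank
    segment = quote(f"{segment_id}:{lo},{hi}", safe=":,")
    return f"{server}/{dsn}/features?segment={segment}"


def parse_features(dasgff_xml):
    """Extract id, type, start and end from every FEATURE element."""
    root = ET.fromstring(dasgff_xml)
    features = []
    for feat in root.iter("FEATURE"):
        features.append({
            "id": feat.get("id"),
            "type": (feat.findtext("TYPE") or "").strip(),
            "start": int(feat.findtext("START")),
            "end": int(feat.findtext("END")),
        })
    return features


if __name__ == "__main__":
    # Placeholder server and data source name; substitute the real ones.
    print(features_url("http://das.example.org/das", "human_assembly",
                       "CONTIG_1", 1000, 1350, flank=500))
    print(parse_features(SAMPLE_DASGFF))
```

The flank argument is one way to express "features located near the 
sequence": widen the hit range before asking the server, then filter the 
returned features locally.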

This works manually, but there doesn't appear to be any standard automated 
HTTP protocol for it. I could still hack it, and just hand-craft the required 
HTTP requests and dissect the responses using regular-expression pattern 
matching, but they could change the screens without notification, and XML 
would be a lot nicer than having to decode the blast_format BLAST results 
page.

I don't want to have to store a local copy of the genome assembly because 
this seems like cracking an egg with a mallet, and I ain't got the disk 
space....

Has anyone got any ideas?
