[DAS] Restricting the range of an alignment query.

Javier Herrero jherrero at ebi.ac.uk
Thu Aug 12 13:15:04 UTC 2010


Hi Thomas

On Thursday 12 Aug 2010 12:06:19 Thomas Down wrote:
> On Thu, Aug 12, 2010 at 10:41 AM, Andy Jenkinson
> <andy.jenkinson at ebi.ac.uk> wrote:
> > OK so you're a lot further along, sounds good. A while back we were
> > aiming to get compara alignments as DAS too (which necessitated the cols
> > parameter).
> 
> By "compara alignments", you're talking about the MySQL ensembl-compara
> database, right?

Yes, that is what we discussed some time ago.

> Is there still any interest in this on the Ensembl side?  It's something
> I'm going to be needing soon, too (my current chain-file-based server
> doesn't handle all the cases I'm interested in).

I guess the interest must come from "the other side". I am quite keen on
providing alignments through DAS if people and/or DAS clients will use them
and if that is not too heavy for our servers. You can imagine that things can
go horribly wrong if someone asked for all 33-way EPO alignments on a
chromosome at once. This can probably be controlled on the server.
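
As a rough illustration of the kind of server-side control I mean, here is a
minimal sketch in Python. The limit MAX_SPAN_BP and the function
request_allowed are hypothetical names made up for this example; they are not
part of any existing DAS server or of the Ensembl API:

```python
from typing import Optional

# Hypothetical limit; a real value would depend on server capacity and on
# how deep the alignments are (e.g. pairwise vs. 33-way EPO).
MAX_SPAN_BP = 1_000_000


def request_allowed(start: Optional[int], end: Optional[int],
                    max_span: int = MAX_SPAN_BP) -> bool:
    """Return True if the requested range is small enough to serve.

    start/end are 1-based inclusive coordinates; None means no range was
    given, i.e. the client asked for the whole sequence.
    """
    if start is None or end is None:
        # Whole-chromosome request: refuse rather than fetch everything.
        return False
    if start < 1 or end < start:
        raise ValueError(f"invalid range: {start},{end}")
    return (end - start + 1) <= max_span
```

A server could run a check like this before touching the database and return
an error response for requests that fail it.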

Another question is whether it is easy to fit genomic alignments into the
current dasalignment structure. I am not sure how to interpret things like
dbAccessionId, objectVersion, dbSource, etc. for a genomic alignment. In other
words, should protein and genomic alignments share the same query and
response?

Javier

> > > As regards to what would be necessary to do what you want, I don't
> > > think we can change the subject parameter unless the existing
> > > alignment servers and clients using it can be changed, i.e. Pfam (not
> > > sure if there are others). I can't really think of another way to do
> > > it off the top of my head - the 'query' parameter has space for
> > > start/end positions and can be a sequence identifier, but this is more
> > > like 'get all alignments containing this sequence' which is not quite
> > > the same thing. It would also be a bit clunky to describe - what would
> > > "?query=alignment42:30,40" do?
> > >
> > > Any suggestions?
> > >
> > > Well, the obvious thing is to couple the coordinate restrictions to the
> > > sequence to which they apply.
> > >
> > 
> > > Simplest solution I can think of would be to add:
> > >            ?segment=seqName[:start,end]
> > > 
> > > ...where:
> > >            ?segment=P12345
> > > 
> > > ...is synonymous with:
> > >            ?subject=P12345        (which would still be supported),
> > > 
> > > ...but...
> > > 
> > >            ?segment=22:30000000,30200000
> > > 
> > > ...does what I want.   Maybe not the cleanest solution, but I don't
> > > think it's going to horribly break anything (unless there are
> > > subtleties I'm missing here?)
> > >
> > >                      Thomas.
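
To make the proposed syntax concrete, here is a minimal sketch in Python of
how a server might parse a value of the form seqName[:start,end]. The names
Segment and parse_segment are hypothetical, and the sketch assumes sequence
names contain no ':' or ',' characters, which nobody in this thread has
confirmed:

```python
import re
from typing import NamedTuple, Optional


class Segment(NamedTuple):
    name: str
    start: Optional[int]  # None when no range was given
    end: Optional[int]


# seqName, optionally followed by ":start,end" (assumed grammar).
_SEGMENT_RE = re.compile(r"([^:,]+)(?::(\d+),(\d+))?")


def parse_segment(value: str) -> Segment:
    """Parse 'seqName' or 'seqName:start,end' into a Segment."""
    m = _SEGMENT_RE.fullmatch(value)
    if m is None:
        raise ValueError(f"malformed segment: {value!r}")
    name, start, end = m.groups()
    if start is None:
        # Bare name: behaves like the existing ?subject=seqName form.
        return Segment(name, None, None)
    start_i, end_i = int(start), int(end)
    if start_i > end_i:
        raise ValueError(f"start after end in segment: {value!r}")
    return Segment(name, start_i, end_i)
```

With this, "P12345" parses to a segment with no range, and
"22:30000000,30200000" parses to sequence 22 with both coordinates set.
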
> > 
> > I think you're right in that it is going to need another parameter to
> > make it work. Any objections from anyone? What would happen if you
> > specified multiple segments which did not correspond to the same
> > section? Return multiple blocks representing multiple horizontal
> > sections?
> 
> That's the idea.
> 
> Use case for this: user is viewing 22:30000000,30200000, then zooms out.
>  Trigger a fetch for:
> 
>            ?segment=22:29900000,29999999;segment=22:30200001,30300000
> 
> ...then merge the results into the working set.
> 
>                     Thomas.
> _______________________________________________
> DAS mailing list
> DAS at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/das
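
The zoom-out use case above can be sketched on the client side as follows.
This is only an illustration in Python: flanking_segments and segment_query
are hypothetical helper names, coordinates are assumed to be 1-based and
inclusive, and the ';' separator is taken from the example request above:

```python
from typing import List


def flanking_segments(name: str, old_start: int, old_end: int,
                      new_start: int, new_end: int) -> List[str]:
    """Return the segments of the new view not covered by the old view."""
    flanks = []
    if new_start < old_start:
        # Region newly exposed to the left of the old view.
        flanks.append((new_start, min(new_end, old_start - 1)))
    if new_end > old_end:
        # Region newly exposed to the right of the old view.
        flanks.append((max(new_start, old_end + 1), new_end))
    return [f"{name}:{s},{e}" for s, e in flanks]


def segment_query(segments: List[str]) -> str:
    """Join segments into a query string with one segment= per entry."""
    return "?" + ";".join(f"segment={s}" for s in segments)


flanks = flanking_segments("22", 30000000, 30200000, 29900000, 30300000)
print(segment_query(flanks))
```

For a view of 22:30000000,30200000 widened to 22:29900000,30300000 this
yields the two flanking segments from the example, and the returned blocks
can then be merged into the working set.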

-- 
Javier Herrero, PhD
Ensembl Compara Project Leader
European Bioinformatics Institute (EMBL-EBI)
Wellcome Trust Genome Campus, Hinxton
Cambridge - CB10 1SD - UK


