[DAS] dsn

Titus Brown titus@caltech.edu
Sun, 24 Mar 2002 13:26:41 -0800


-> > My question about best practice remains, though: how do people organize
-> > lots of little sequences that don't yet share a coordinate system? Here
-> > is the simplest case we have: two ~ 1kb sequences from the opposite ends
-> > of a ~ 5kb insert (call these STCs).
-> > 
-> >    DP |<----.....~ 3kb.....---->| TP
-> > 
-> > I would like to have one entry point for the pair, but since the size of
-> > the unsequenced part in the middle isn't known precisely, I can't invent 
-> > a common coordinate system for the STC pair to which to attach the entry
-> > point. Each sequence defines a little, independent, coordinate system.
-> > 
-> I can only tell you how I observe the Drosophila genome people doing it.
-> 
-> They just place the fragments with the correct relative orientations (if
-> known) spaced by Ns (approximate number if known; if not, a fixed number
-> to represent an unknown number).  So in the early phase of sequencing of
-> P1s/BACs the sequenced fragments derived from the P1s were shrapnel of
-> this kind often with arbitrary orientation.  With time, contigs coalesced.
-> The most painful phase was when enough coalescence occurred to determine
-> the actual cytological orientation of the sequences and large numbers of
-> contigs flipped orientation making it necessary to remap the personal
-> annotations accumulated previously.
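
The N-spacing scheme described in the quoted text above can be sketched
roughly as follows. This is only an illustration under stated assumptions,
not the Drosophila group's actual tooling: the function names, the input
tuple layout, and the placeholder length of 100 Ns for a gap of unknown size
are all hypothetical.

```python
# Hypothetical sketch: join fragments into one coordinate system, padded by Ns.
UNKNOWN_GAP = 100  # assumed fixed N count standing in for an unknown gap size

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T, N only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def build_scaffold(fragments):
    """Concatenate fragments with N spacers.

    fragments: list of (name, sequence, flip, gap_after) tuples, where flip
    says whether to reverse-complement the fragment and gap_after is the
    estimated gap to the next fragment (None when the size is unknown).

    Returns (scaffold, offsets); offsets maps each name to its 1-based,
    inclusive (start, stop) position on the scaffold.
    """
    parts = []
    offsets = {}
    pos = 0
    for i, (name, seq, flip, gap_after) in enumerate(fragments):
        oriented = reverse_complement(seq) if flip else seq
        offsets[name] = (pos + 1, pos + len(oriented))
        parts.append(oriented)
        pos += len(oriented)
        if i < len(fragments) - 1:  # no spacer after the last fragment
            gap = UNKNOWN_GAP if gap_after is None else gap_after
            parts.append("N" * gap)
            pos += gap
    return "".join(parts), offsets
```

For an STC pair this would be called with the DP read, an estimated gap of
about 3000, and the TP read flipped if it was sequenced from the opposite
strand. Any annotation attached to the scaffold coordinates has to be
remapped whenever a gap estimate or an orientation changes, which is the
pain point described in the quoted text.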

Speaking as someone in possession of no more than 5% of their model organism's
genome ;) (~50 MB, most of it in STCs), I'd say you simply can't use DAS to
organize the sequences.  Moreover, we have no idea of the orientation of, or
the relationship between, the BACs that we do have sequenced, because we
don't have any mapping info.

I'm hoping to put up ~40 full BACs for DAS viewers to see, since we plan to
use DAS extensively for annotation viewing.  For now I'm simply going to make
a bunch of independent entry points, and treat each BAC as a separate
"chromosome".
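
A minimal sketch of the "one entry point per BAC" idea, assuming the BAC
lengths are already known. The function name, the example URL, and the BAC
identifiers are hypothetical, and the DASEP / ENTRY_POINTS / SEGMENT element
and attribute names are written from memory of the DAS 1 entry_points
response, so they should be checked against the specification before use.

```python
import xml.etree.ElementTree as ET


def entry_points_xml(bac_lengths,
                     href="http://example.org/das/mygenome/entry_points",
                     version="1.0"):
    """Build an entry_points-style document with one segment per BAC.

    bac_lengths: dict mapping a BAC identifier to its sequence length.
    Each BAC gets its own independent coordinate system running from 1 to
    its length, i.e. it is treated as a separate "chromosome".
    The href and version defaults are placeholders, not real values.
    """
    root = ET.Element("DASEP")
    points = ET.SubElement(root, "ENTRY_POINTS", href=href, version=version)
    for bac_id, length in sorted(bac_lengths.items()):
        segment = ET.SubElement(
            points, "SEGMENT",
            id=bac_id, start="1", stop=str(length), orientation="+",
        )
        segment.text = bac_id  # human-readable label for the viewer
    return ET.tostring(root, encoding="unicode")
```

Because no segment refers to any other, nothing here asserts an order or
orientation between BACs, which matches the situation of having no mapping
information.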

--titus