[DAS] LDAS + Ensembl

Tony Cox avc@sanger.ac.uk
Thu, 28 Mar 2002 10:16:37 +0000 (GMT)


On Wed, 27 Mar 2002 lstein@formaggio.cshl.org wrote:

Hi Lincoln,

Thanks for tracking this down. I'll make the code fix for now, but include the
updated Das2 version in the next code release, which should be for the mouse
release now that the assembly is all but "blessed".

I also noticed the file extension issue after getting burned by it when
implementing the data upload scripts. One question: is bz2 a sensible default
supported format? While lots of (Linux) clients may be able to produce the
files, bzip2 is not a default install on many Unix servers like our Tru64
boxes. If you can use bz2, you can use gz, and I believe the difference in
compression to be relatively small. All in the name of simplicity and
maintainability. It is up there with automatically assuming that everybody's
tar supports the -z option.
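
For illustration only, here is a minimal Python sketch of the suffix rule
described in the quoted message (.das, .das.gz, .das.Z or .das.bz2 is a DAS
load file; anything else is assumed to be GFF), plus a way to measure the gz
versus bz2 size difference on real data. The function names and the
case-sensitive matching are assumptions; this is not the LDAS loader's code.

```python
import bz2
import gzip

# Compression suffixes named in the quoted message, mapped to the tool.
COMPRESSION_SUFFIXES = {".gz": "gzip", ".Z": "compress", ".bz2": "bzip2"}


def classify_load_file(filename):
    """Guess (format, compression) from the file name alone.

    Hypothetical helper: ".das" before an optional compression suffix
    means a DAS load file; every other name falls back to GFF.
    """
    compression = None
    stem = filename
    for suffix, tool in COMPRESSION_SUFFIXES.items():
        if filename.endswith(suffix):
            compression = tool
            stem = filename[: -len(suffix)]
            break
    file_format = "das" if stem.endswith(".das") else "gff"
    return file_format, compression


def compressed_sizes(data):
    """Return the compressed size in bytes of `data` under gzip and bzip2."""
    return {
        "gzip": len(gzip.compress(data)),
        "bzip2": len(bz2.compress(data)),
    }
```

Calling compressed_sizes() on the bytes of an actual load file would show how
much bz2 really saves over gz for this kind of data; no figures are claimed
here.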

Did you incorporate the changes that James made here, and sent you, to add back
the "sources" call? If we update the Das2 code and this is missing, our upload
code will break. For that reason I'll roll your changes below into our code
ASAP and let you know. I'll check out a new copy of Das2 on our dev server and
test it out.

thanks

Tony

PS: yes, we do indeed filter out Component features. We also rely on the feature 
type IDs containing either "transcript" or "exon" to enable drawing of the
"humpy/bumpy" introns, rather than simply boxing the features up (naturally the
feature ID has to be the same as well!).
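
As a rough Python sketch of the rule in the PS (not the Ensembl drawing code):
Component features are dropped, the rest are grouped by feature ID, and a group
is drawn with joined introns only if its type IDs contain "transcript" or
"exon". The dict keys, the function names, the exact match on "Component" and
the requirement that every feature in a group qualifies are all assumptions.

```python
from collections import defaultdict


def drawable_groups(features):
    """Group features by ID, skipping those of type "Component".

    `features` is an iterable of dicts with hypothetical keys
    "id" and "type".
    """
    groups = defaultdict(list)
    for feature in features:
        if feature["type"] == "Component":
            continue  # filtered out, as described in the PS
        groups[feature["id"]].append(feature)
    return dict(groups)


def draw_style(group):
    """Return "introns" for transcript/exon groups, otherwise "box"."""
    def is_gene_part(feature):
        type_id = feature["type"]
        return "transcript" in type_id or "exon" in type_id

    # Assumption: every feature sharing the ID must qualify.
    return "introns" if all(is_gene_part(f) for f in group) else "box"
```

With this sketch, two features sharing an ID and typed "exon" would be joined
up, while a lone feature typed "snp" would simply be boxed.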


+>Hi,
+>
+>I've tracked down the problem with Jim's features not displaying on
+>Ensembl contigview.  I've got it fixed and working now with my
+>server.  To see:
+>
+>	 1) go to http://www.ensembl.org/perl/contigview?contig=Z96810.1.99682
+>	 2) add the das server http://brie2.cshl.org:8081/db/misc/das
+>	    DSN "freeman"
+>	 3) zoom way way in -- there are a LOT of features in the test
+>		 file -- at low mag it looks like a solid line (I was
+>		 going crazy trying to debug this "bug")
+>
+>There were a couple of issues that needed to be fixed.  I'll start
+>with the most important ones:
+>
+>     1) SOFTWARE VERSION SKEW (For Tony's attention!)
+>	The Bio-Das2 library on Ensembl needs to be updated.  A few
+>	weeks ago I added a new tag to the XML and updated Bio-Das2
+>	to accommodate it -- Thomas updated Dazzle.  You need Bio-Das2
+>	version 0.6 or higher.  This is available in the FTP directory
+>	at www.biodas.org, or via CVS (don't worry; only two lines
+>	changed).
+>
+>	If Tony can't make this change because of release schedules,
+>	etc., there's a quick workaround with the LDAS server.  See
+>	below.
+>
+>     2) LOAD PROBLEMS AT JIM'S END
+>        The load file must have the suffix .das (or .das.gz, .das.Z,
+>	.das.bz2 for compressed files).  Otherwise the loader assumes
+>	that it is a GFF-format file.  The sample file Jim sent me 
+>	ended in .txt!  This is a bit anal so I'm going to eliminate
+>	this restriction in the next version of the loader.
+>
+>     3) REDUNDANT DATA IN JIM'S FILE
+>	The same variations were annotated on multiple coordinate
+>	systems. I suppose this is a reflection of frustration, but it
+>	isn't necessary.  A cleaned up load file that works is
+>	attached.
+>	
+>     4) CONFIGURATION FILE
+>	This was basically fine, but Jim might want to exclude
+>	features of type "Component" from the features dump.
+>	Otherwise it might show up in the Ensembl display (actually
+>	it doesn't seem to, probably because Tony filters it out).
+>	A slightly modified configuration file with the exclude=
+>	option is attached.
+>
+>WORKAROUND TO MAKE LDAS WORK WITH ENSEMBL CONTIGVIEW (AS OF 3/27/02):
+>
+>   1) find the CGI script named "das"
+>   2) find the subroutine named error_segment() and comment it out
+>   3) replace the subroutine with a dummy error_segment() that does
+>      nothing:
+>
+>	sub error_segment { }
+>
+>Sorry for the delay in figuring all this out,
+>
+>Lincoln
+>
+>-- 
+>
+>

******************************************************
Tony Cox			Email:avc@sanger.ac.uk
Sanger Institute		WWW:www.sanger.ac.uk
Wellcome Trust Genome Campus	Webmaster
Hinxton				Tel: +44 1223 834244
Cambs. CB10 1SA			Fax: +44 1223 494919
******************************************************