[Bioperl-l] EST Alignment questions

Jamie Hatfield jamie@genome.arizona.edu
Fri, 1 Nov 2002 09:12:56 -0700


> -----Original Message-----
> From: Lincoln Stein [mailto:lstein@cshl.org] 
> 
> I want to build the alignment up a step at a time from BLAT output.
> BLAT represents the alignment as a series of non-gapped pairs.  To
> represent the alignment correctly you have to add the appropriate
> pad characters.

I've kinda run into the same problem.  cap3 is what we use for assembly,
and it does pad the alignments with gaps (*'s).  But I have to store
these gaps separately from the sequence (otherwise I'd have to store a
new copy of the sequence for each assembly, since a sequence won't
always gap the same way in different assemblies).  So now I parse the
gaps out, keep an array of gap positions, and put them back in when
displaying the sequences (along with an offset in front of the EST to
make it line up with the consensus), so that the sequences line up
correctly and SimpleAlign is happy.  No fault implied to SimpleAlign...
it's simple.  I should probably just extend it to do what I want, but
I'm not quite *that* familiar with BioPerl and OO Perl yet.
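
To make the gap bookkeeping above concrete, here is a minimal sketch of
the idea.  It is written in Python purely for illustration (it is not
the WebEST code and it does not use BioPerl), and the function names,
the 0-based positions and the choice of "-" as the leading fill
character are all assumptions on my part.

```python
def strip_gaps(padded, gap_char="*"):
    """Split a padded read into (ungapped sequence, gap positions).

    Positions are 0-based indexes into the padded string, so the same
    list can be replayed later to rebuild the padded form.
    """
    gaps = [i for i, ch in enumerate(padded) if ch == gap_char]
    return padded.replace(gap_char, ""), gaps


def restore_gaps(seq, gaps, offset=0, gap_char="-", lead_char="-"):
    """Rebuild one display row from an ungapped read.

    gaps   -- positions recorded by strip_gaps() for ONE assembly
    offset -- columns between the consensus start and the read start
              (hypothetical parameter; filled with lead_char)
    """
    gap_set = set(gaps)
    bases = iter(seq)
    row = []
    # Walk the padded coordinates, emitting a gap or the next base.
    for i in range(len(seq) + len(gap_set)):
        row.append(gap_char if i in gap_set else next(bases))
    return lead_char * offset + "".join(row)


seq, gaps = strip_gaps("AC*GT*A")
print(seq, gaps)
print(restore_gaps(seq, gaps, offset=3))
```

The point of the split is that the ungapped sequence is stored once,
while one gap list (and one offset) is stored per assembly the read
takes part in.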
 
> Also I need to insert HTML so that nonaligned bases change color,
> and AlignIO doesn't have the appropriate callbacks.  But that's a
> minor issue.
> 

I completely understand this part.  I have been doing this myself for
our project (WebEST, which has some searching features modeled after
HarvEST - http://harvest.ucr.edu/).  It was kinda annoying to do on my
own.  WebEST is the main driving force in getting me to look at BioPerl.
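
For what it's worth, the coloring step can be sketched roughly like
this.  Again this is Python for illustration only, not what WebEST
actually runs; the function name and the "mismatch" CSS class are made
up, and it assumes both rows have already been padded to the same
alignment columns.

```python
from html import escape


def highlight_mismatches(read_row, consensus_row, css_class="mismatch"):
    """Wrap runs of bases that differ from the consensus in <span> tags.

    Both rows are assumed to be in the same padded coordinates; zip()
    stops at the shorter row, so any overhang is silently dropped.
    """
    out = []
    in_span = False
    for base, cons in zip(read_row, consensus_row):
        differs = base.upper() != cons.upper()
        if differs and not in_span:
            out.append(f'<span class="{css_class}">')
            in_span = True
        elif not differs and in_span:
            out.append("</span>")
            in_span = False
        out.append(escape(base))
    if in_span:
        out.append("</span>")  # close a run that reaches the row end
    return "".join(out)


print(highlight_mismatches("ACGTA", "ACCTA"))
```

Grouping adjacent mismatches into one span keeps the generated HTML
small; the actual color then lives in a stylesheet rule for the class.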

WebEST is still very much in development, but if you want to see an
example of the way I display alignments, it's at
http://genome.arizona.edu/cgi-bin/WebEST/viewContig.cgi?AssemblyID=302&ContigID=520&zoom=0
for what it's worth.  It's more or less modeled after consed.

------------------------------------------------------------------------
Jamie Hatfield                                Room 541H, Marley Building
Systems Programmer                            University of Arizona
Arizona Genomics Computational                Tucson, AZ  85721
  Laboratory (AGCoL)                          (520) 626-9598