[Biojava-l] Global alignment problem (bug?) in NeedlemanWunsch

Chris Friedline cfriedline at vcu.edu
Tue Oct 26 19:29:30 UTC 2010


I'll need to go back and revisit that after my deadline passes at the
end of this week. Initially, I was creating the Sequence objects on the
fly at alignment time, but it would be more efficient to build them once
and store them in the gene object itself.  I was also passing a new
InputStreamReader for the substitution matrix on every alignment
(reading the matrix from my jar each time); loading it once and keeping
it as a string would be a better option, especially since I'm threading
and running so many alignments.
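
A minimal sketch of the "read the matrix once, keep it as a string" idea,
using only the JDK. The class MatrixTextCache, the ReaderOpener
interface, and the key names are hypothetical and are not part of
BioJava; the only assumption is that each alignment can take its
matrix from a Reader over the cached text.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical helper: caches substitution-matrix text so it is read once. */
final class MatrixTextCache {

    /** Hypothetical callback that opens the matrix source (e.g. a jar resource). */
    public interface ReaderOpener {
        Reader open(String name) throws IOException;
    }

    // Shared by all worker threads; Strings are immutable, so sharing is safe.
    private static final ConcurrentMap<String, String> CACHE =
            new ConcurrentHashMap<>();

    private MatrixTextCache() {}

    /** Drains the reader into a String and closes it. */
    public static String readAll(Reader in) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader br = new BufferedReader(in)) {
            char[] buf = new char[4096];
            int n;
            while ((n = br.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }

    /**
     * Returns the cached text for the given name. The opener is invoked
     * only when the name is not cached yet; computeIfAbsent on a
     * ConcurrentHashMap runs the loader at most once per key.
     */
    public static String get(String name, ReaderOpener opener) {
        return CACHE.computeIfAbsent(name, key -> {
            try {
                return readAll(opener.open(key));
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
    }
}
```

Each alignment task would then wrap the cached text in a fresh
java.io.StringReader instead of reopening the resource in the jar, so
the jar is touched once no matter how many threads or alignments run.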

Chris

On Tue, Oct 26, 2010 at 3:23 PM, Andreas Prlic <andreas at sdsc.edu> wrote:
>
> ok, how do you create the biojava3 Sequence objects? just trying to
> find out where the bottlenecks are, so we can fix them...
>
> A
>
> On Tue, Oct 26, 2010 at 12:20 PM, Chris Friedline <cfriedline at vcu.edu> wrote:
> > Hi,
> > The io should be the same, since I've used the same set of genes for testing
> > both.  So, it's either the alignment calculation or the new biojava design
> > contributing to the slowness.
> > Chris
> >
> > On Tue, Oct 26, 2010 at 2:42 PM, Andreas Prlic <andreas at sdsc.edu> wrote:
> >>
> >> Hi Chris,
> >>
> >> about your comment that the biojava3-alignment is slower than the 1.7
> >> one: Do you have any data on whether this comes from the IO, or
> >> whether the actual alignment calculation is slower?
> >>
> >> Andreas
> >>
> >> On Sun, Oct 24, 2010 at 7:57 AM, Chris Friedline <cfriedline at vcu.edu>
> >> wrote:
> >> > Hello,
> >> >
> >> > I am getting a weird problem with protein alignment using
> >> > NeedlemanWunsch in 1.7.1, in that the alignment does not span the
> >> > entire length of the proteins.  I've verified that this should not
> >> > happen with needle (from EMBOSS), neobio, BioJava3, and NW on NCBI.
> >> > I'm reluctant to switch to BioJava3 at this time, since performance is
> >> > about 2-3x slower than 1.7.1 for the alignments, and I'm doing about
> >> > 350,000 of them.
> >> >
> >> > An example of this alignment error is shown here:
> >> > http://pastebin.com/mdX516R6
> >> >
> >> > Notice that the alignment stops 1 amino acid short of the end in both
> >> > cases.  The parameters for the alignment are: BLOSUM50, gapOpen=10,
> >> > gapExtend=2.
> >> >
> >> > Thanks,
> >> > Chris
> >> >
> >> > --
> >> > PhD Candidate, Integrative Life Sciences
> >> > Virginia Commonwealth University
> >> > Richmond, VA
> >> > _______________________________________________
> >> > Biojava-l mailing list  -  Biojava-l at lists.open-bio.org
> >> > http://lists.open-bio.org/mailman/listinfo/biojava-l
> >> >
> >>
> >>
> >>
> >> --
> >> -----------------------------------------------------------------------
> >> Dr. Andreas Prlic
> >> Senior Scientist, RCSB PDB Protein Data Bank
> >> University of California, San Diego
> >> (+1) 858.246.0526
> >> -----------------------------------------------------------------------
> >
> >
> >
> >
>
>
>



--
PhD Candidate, Integrative Life Sciences
Virginia Commonwealth University
Richmond, VA



