[Biojava-l] SCF Parser issues with gaps and unread peak values ?

Ashika Umanga Umagiliya aumanga at biggjapan.com
Fri Jun 19 01:28:56 UTC 2009

Hi Andy,

Thank you for the help.
Yes, looking at the trace, the bases have to be TGTTGTTAGG, as you said.

I get the Chromatogram using the SCF parser like this:

Chromatogram chroma = SCF.create(scfFile);

then when I print the bases using


It shows the base sequence "TGTTGTAGG" (a T is missing).
So I added a System.out.println() to the decode method in the SCF parser
class, i.e. at:

static class DefaultUncertaintyDecoder implements BaseCallUncertaintyDecoder {
    public DefaultUncertaintyDecoder() { }

    public Symbol decode(byte call) throws IllegalSymbolException {
        char c = (char) call;

But here too it showed "TGTTGTAGG" (with the T missing).

So I think this is not an issue with the rendering logic but with the parser.
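
One way to check this further, along the lines of Andy's suggestion to look at the base-call positions, is to compare the spacing between consecutive call offsets: a missed peak should leave a gap roughly twice the usual spacing. Below is a minimal sketch in plain Java, not BioJava code. The class name PeakGapFinder, the 1.6 threshold, and the offsets in main are all assumptions made up for illustration; I believe the real offsets can be read from the base-call alignment under the Chromatogram.OFFSETS label, but that lookup is left out here and should be checked against the Javadoc.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of BioJava.
public class PeakGapFinder {

    /** Median distance between consecutive base-call offsets. */
    public static double medianSpacing(int[] offsets) {
        int[] gaps = new int[offsets.length - 1];
        for (int i = 1; i < offsets.length; i++) {
            gaps[i - 1] = offsets[i] - offsets[i - 1];
        }
        Arrays.sort(gaps);
        int mid = gaps.length / 2;
        return (gaps.length % 2 == 1)
                ? gaps[mid]
                : (gaps[mid - 1] + gaps[mid]) / 2.0;
    }

    /**
     * Returns every index i where the distance from call i to call i + 1
     * is larger than factor times the median spacing.
     */
    public static List<Integer> suspiciousGaps(int[] offsets, double factor) {
        List<Integer> result = new ArrayList<>();
        if (offsets.length < 3) {
            return result; // too few calls to estimate a typical spacing
        }
        double limit = factor * medianSpacing(offsets);
        for (int i = 0; i + 1 < offsets.length; i++) {
            if (offsets[i + 1] - offsets[i] > limit) {
                result.add(i);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up offsets for the nine calls "TGTTGTAGG";
        // these are NOT values read from the real SCF file.
        int[] offsets = {10, 22, 34, 46, 58, 70, 94, 106, 118};
        System.out.println(suspiciousGaps(offsets, 1.6));
    }
}
```

With the made-up offsets in main, this prints [5], i.e. the extra space sits between the sixth and seventh calls, which is where the T is missing in TGTTGTAGG. If the real offsets show no such gap, the problem is more likely in how the callboxes are drawn than in the parser.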

Thanks again,

Andy Yates wrote:
> Hi Umanga,
> Looking at your graphic the SCF parser hasn't missed out the T. Looking
> at the trace I can read the bases to be TGTTGTTAGGG. What I think has
> happened is the callboxes are assuming a more uniform trace since the
> G has bled a bit too much into the previous T.
> My knowledge of the callbox code is non-existent as my previous work on
> these types of graphics did not call for this kind of rendering (we
> didn't use any base calls as we were looking for variation in samples
> with varying copy numbers so they were quite subtle). Have a look in the
> options for the call box rendering & see if there is anything which can
> limit the size of a callbox & the bounds it is rendered to. Also looking
> at the basecalled positions might be a good idea as that would give a
> clue about what the graphics are attempting to do WRT this data.
> Hope that helps.
> Andy
> Ashika Umanga Umagiliya wrote:
>> Greetings all,
>> Please refer to image at:  http://i43.tinypic.com/sfdjs6.png
>> As shown in Figure 1, I am drawing bases using a Phrap ACE file and
>> drawing chromatograms using
>> Biojava 'ChromatoGraphics'. I use an .SCF file generated by phrep.
>> As can be seen in Figure 1, the alignment breaks from the 'Magenta
>> callbase-box'. The reason may be that the 'SCF parser' has ignored one of
>> the peaks, which should be a 'T'.
>> Figure 2 shows what needs to happen. If I can somehow fill it with
>> a T (or sometimes maybe a gap), I can preserve the alignment between sequences.
>> Figure 3 shows the same chromatogram in ChromasPro, which has
>> identified the peak as a T.
>> I went through Biojava's SCF parser and noticed that it finally
>> creates a 'SimpleAlignment'. Can I use 'GappedSymbolList' to fix this?
>> Is there a way to read the gaps (or, in cases like this, the correct base
>> T) and make these two align?
>> Thanks in advance,
>> Umanga