[Biojava-l] SCF Parser issues with gaps and unread peak values ?

Andy Yates ayates at ebi.ac.uk
Fri Jun 19 15:37:10 UTC 2009


Hi Umanga,

That does sound like a problem with the decoder. It's been quite a while
since I last looked at the SCF code. Maybe Franklin, who posted to this
list recently (and who I'm guessing is a bit more used to SCF file
formats at the moment), could help if he would be so willing :)
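
If it helps narrow things down, here is a minimal sketch (not BioJava
code) of one way to spot a dropped call: compare the spacing between
consecutive base-call peak offsets and flag any gap much wider than the
typical spacing. The class name PeakGapCheck, the method findWideGaps,
the 1.5 factor and the offsets in main are all made up for illustration.
With a real chromatogram the offsets would come from the OFFSETS row of
chroma.getBaseCalls(), which ChromatogramTools.getTraceOffsetArray(chroma)
should turn into an int[], though that is worth checking against the
javadoc.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of BioJava.
class PeakGapCheck {

    /**
     * Returns every index i where the distance between call i and call i+1
     * is more than `factor` times the median distance between neighbours.
     * For an even number of spacings the upper median is used.
     */
    static List<Integer> findWideGaps(int[] offsets, double factor) {
        List<Integer> suspects = new ArrayList<>();
        if (offsets.length < 3) {
            return suspects; // too few calls to estimate a typical spacing
        }
        int[] spacing = new int[offsets.length - 1];
        for (int i = 0; i < spacing.length; i++) {
            spacing[i] = offsets[i + 1] - offsets[i];
        }
        int[] sorted = spacing.clone();
        Arrays.sort(sorted);
        double median = sorted[sorted.length / 2];
        for (int i = 0; i < spacing.length; i++) {
            if (spacing[i] > factor * median) {
                suspects.add(i);
            }
        }
        return suspects;
    }

    public static void main(String[] args) {
        // Invented offsets: evenly spaced calls with one double-width gap.
        int[] offsets = {10, 22, 34, 46, 58, 70, 94, 106, 118};
        System.out.println(findWideGaps(offsets, 1.5)); // prints [5]
    }
}
```

A flagged index i means the gap between call i and call i+1 looks wide
enough to hold another peak. If the trace shows a T there but the offsets
have no entry for it, the call was lost before the rendering stage.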

Andy

Ashika Umanga Umagiliya wrote:
> Hi Andy,
> 
> Thank you for the help.
> Yes, looking at the trace, the bases have to be TGTTGTTAGG as you said.
> 
> I get the Chromatogram using the SCF parser like this:
> 
> Chromatogram chroma = SCF.create(scfFile);
> 
> then when I print the bases using
> 
> chroma.getBaseCalls()
> 
> It shows the base sequence "TGTTGTAGG" (a T is missing).
> So I put a System.out.print() in the decode method of the SCF parser
> class, that is, at:
> 
> 
> static class DefaultUncertaintyDecoder implements BaseCallUncertaintyDecoder {
>        public DefaultUncertaintyDecoder() { }
> 
>        public Symbol decode(byte call) throws IllegalSymbolException {
>            char c = (char) call;
>            System.out.print(""+c);
> ..
> ..
> 
> 
> But here also it showed "TGTTGTAGG" (with the T missing).
> 
> So I think this is an issue in the parser, not in the rendering logic.
> 
> 
> Thanks again,
> Umanga
> 
> 
> 
> 
> Andy Yates wrote:
>> Hi Umanga,
>>
>> Looking at your graphic the SCF parser hasn't missed out the T. Looking
>> at the trace I can read the bases to be TGTTGTTAGGG. What I think has
>> happened is the callboxes are assuming a more uniform trace since the
>> G has bled a bit too much into the previous T.
>>
>> My knowledge of the callbox code is non-existent as my previous work on
>> these types of graphics did not call for this kind of rendering (we
>> didn't use any base calls as we were looking for variation in samples
>> with varying copy numbers so they were quite subtle). Have a look in the
>> options for the call box rendering & see if there is anything which can
>> limit the size of a callbox & the bounds it is rendered to. Also looking
>> at the basecalled positions might be a good idea as that would give a
>> clue about what the graphics are attempting to do WRT this data.
>>
>> Hope that helps.
>>
>> Andy
>>
>> Ashika Umanga Umagiliya wrote:
>>  
>>> Greetings all,
>>>
>>> Please refer to image at:  http://i43.tinypic.com/sfdjs6.png
>>>
>>> As shown in Figure 1, I am drawing bases using a Phrap ACE file and
>>> drawing chromatograms using
>>> Biojava's 'ChromatogramGraphic'. I use an .SCF file generated by phrep.
>>>
>>> As can be seen in Figure 1, the alignment breaks from the 'Magenta
>>> callbase-box'. The reason may be that the 'SCF parser' has ignored one
>>> of the peaks, which should be a 'T'.
>>>
>>>
>>> Figure 2: Shows what needs to happen. If I can somehow fill it with
>>> a T (or maybe sometimes a gap), I can preserve the alignment between
>>> sequences.
>>>
>>> Figure 3: Shows the same chromatogram in ChromasPro. It has
>>> identified the peak as a T.
>>>
>>> I went through BioJava's SCF parser and noticed that it finally
>>> creates a 'SimpleAlignment'. Can I utilize 'GappedSymbolList' to fix
>>> this?
>>> Is there a way to read the gaps (or, in cases like this, the correct
>>> base T) and make these two align?
>>>
>>> Thanks in advance,
>>> Umanga
>>> _______________________________________________
>>> Biojava-l mailing list  -  Biojava-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/biojava-l
>>>     
> 
> 


