[Biojava-l] SCF Parser issues with gaps and unread peak values ?
Andy Yates
ayates at ebi.ac.uk
Fri Jun 19 15:37:10 UTC 2009
Hi Umanga,
That does sound like a problem with the decoder. It's been quite a while
since I last looked at the SCF code. Maybe Franklin, who posted to this
list recently (and who I'm guessing is more familiar with the SCF file
format at the moment), could help if he is willing :)
Andy
Ashika Umanga Umagiliya wrote:
> Hi Andy,
>
> Thank you for the help.
> Yes, looking at the trace the bases have to be TGTTGTTAGG as you said.
>
> I get a Chromatogram using the SCF parser like this:
>
> Chromatogram chroma = SCF.create(scfFile);
>
> then when I print the bases using
>
> chroma.getBaseCalls()
>
> it shows the base sequence "TGTTGTAGG" (a T is missing).
> So I put a System.out.println() into the decode method of the SCF parser
> class, that is, at:
>
>
> static class DefaultUncertaintyDecoder
>         implements BaseCallUncertaintyDecoder {
>     public DefaultUncertaintyDecoder() { }
>
>     public Symbol decode(byte call) throws IllegalSymbolException {
>         char c = (char) call;
>         System.out.print("" + c);  // debug output of each decoded call
>         ..
>         ..
>
>
> But here too it showed "TGTTGTAGG" (with the missing T).
>
> So I think this is not an issue with the rendering logic, but with the parser.
>
>
> Thanks again,
> Umanga
>
>
>
>
> Andy Yates wrote:
>> Hi Umanga,
>>
>> Looking at your graphic, the SCF parser hasn't missed out the T. Looking
>> at the trace I can read the bases as TGTTGTTAGGG. What I think has
>> happened is that the callboxes are assuming a more uniform trace, since
>> the G has bled a bit too much into the previous T.
>>
>> My knowledge of the callbox code is non-existent as my previous work on
>> these types of graphics did not call for this kind of rendering (we
>> didn't use any base calls as we were looking for variation in samples
>> with varying copy numbers so they were quite subtle). Have a look in the
>> options for the callbox rendering & see if there is anything which can
>> limit the size of a callbox & the bounds it is rendered to. Also, looking
>> at the base-called positions might be a good idea, as that would give a
>> clue about what the graphics are attempting to do with this data.
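
One way to act on the suggestion to look at the base-called positions is to check whether the spacing between neighbouring calls is roughly even. The sketch below is plain Java and does not use BioJava at all: the class name, the method `wideGaps`, the factor 1.5 and the offsets in `main` are all made up for illustration, and how the real offsets are pulled out of the Chromatogram is left out. It only flags places where two calls sit much further apart than the typical spacing, which is what a skipped peak would probably look like.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CallSpacingCheck {

    /**
     * Returns every index i for which the distance between call i and
     * call i + 1 is larger than factor times the median call spacing.
     */
    public static List<Integer> wideGaps(int[] offsets, double factor) {
        List<Integer> suspicious = new ArrayList<>();
        if (offsets.length < 3) {
            return suspicious; // too few calls to estimate a typical spacing
        }

        // Distance in trace samples between each pair of neighbouring calls.
        int[] spacing = new int[offsets.length - 1];
        for (int i = 0; i < spacing.length; i++) {
            spacing[i] = offsets[i + 1] - offsets[i];
        }

        // Median spacing (upper median when the count is even).
        int[] sorted = spacing.clone();
        Arrays.sort(sorted);
        double median = sorted[sorted.length / 2];

        for (int i = 0; i < spacing.length; i++) {
            if (spacing[i] > factor * median) {
                suspicious.add(i);
            }
        }
        return suspicious;
    }

    public static void main(String[] args) {
        // Hypothetical offsets for nine calls, NOT taken from the real SCF file.
        int[] offsets = {10, 22, 34, 46, 58, 70, 94, 106, 118};
        System.out.println(wideGaps(offsets, 1.5)); // prints [5]
    }
}
```

If an index is reported, the trace between that call and the next one is the place to inspect by eye. An empty result would suggest the calls are evenly spaced and the problem lies elsewhere.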
>>
>> Hope that helps.
>>
>> Andy
>>
>> Ashika Umanga Umagiliya wrote:
>>
>>> Greetings all,
>>>
>>> Please refer to image at: http://i43.tinypic.com/sfdjs6.png
>>>
>>> As shown in Figure 1, I am drawing bases using a Phrap ACE file and
>>> drawing chromatograms using
>>> BioJava's 'ChromatoGraphics'. I use the .SCF file generated by phrep.
>>>
>>> As can be seen in Figure 1, the alignment breaks at the magenta
>>> call box. The reason may be that the SCF parser has ignored one of
>>> the peaks, which should be a 'T'.
>>>
>>>
>>> Figure 2: Shows what needs to happen. If I can somehow fill it with
>>> a T (or maybe sometimes a gap), I can preserve the alignment between
>>> the sequences.
>>>
>>> Figure 3: Shows the same chromatogram in ChromasPro, which has
>>> identified the peak as a T.
>>>
>>> I went through BioJava's SCF parser and noticed that it finally
>>> creates a 'SimpleAlignment'. Can I use 'GappedSymbolList' to fix
>>> this?
>>> Is there a way to read the gaps (or, in cases like this, the correct base
>>> T) and make these two align?
>>>
>>> Thanks in advance,
>>> Umanga
>>> _______________________________________________
>>> Biojava-l mailing list - Biojava-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/biojava-l
>>>
>
>