[Biojava-l] SCF: support for ambiguities

Richard Holland holland at eaglegenomics.com
Wed Nov 5 11:49:40 UTC 2008


This has now been applied to the trunk of biojava-live.

cheers,
Richard

2008/11/3 Richard Holland <holland at eaglegenomics.com>:
> Thanks for the fix. I'll review this and get back to you in a couple of days.
>
> cheers,
> Richard
>
> 2008/11/3 community at struck.lu <community at struck.lu>:
>> I have added the missing ambiguities to DNATools.java and then used these in
>> SCF.java.
>> The two patches are appended to this email.
>>
>> Greetings,
>> Daniel
>>
>>
>> "Richard Holland" <holland at eaglegenomics.com> wrote:
>>
>>> A patch would be much appreciated!
>>>
>>> cheers,
>>> Richard
>>>
>>> 2008/10/31 community at struck.lu <community at struck.lu>:
>>> > True. It was a first quick and dirty hack to get the rest of my project
>>> > going.
>>> >
>>> > I think adding support for the IUPAC ambiguities to DNATools would be the
>>> > most appropriate solution. The SCF class can then easily be adapted.
>>> >
>>> > Are there any plans to do so?
>>> > If not, I could give it a try and submit a patch for DNATools and SCF.
>>> >
>>> > Greetings,
>>> > Daniel
>>> >
>>> > "Richard Holland" <holland at eaglegenomics.com> wrote:
>>> >
>>> >> It is the correct method, yes.
>>> >>
>>> >> However your code constructs a new hash set every time it does the
>>> >> check for W or S, etc. It would be much more efficient to create
>>> >> class-static references to the ambiguity symbols you need, instead of
>>> >> (re)creating them every time they're encountered. A class-static gap
>>> >> symbol reference would also be good in this situation.
>>> >>
>>> >> cheers,
>>> >> Richard
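
The class-static idea in the reply above can be illustrated with a minimal, self-contained sketch. It is not the patch that was applied to biojava-live, and it deliberately uses only java.util rather than the BioJava API: the class name IupacLookup, the method decodeAmbiguity and the use of Set<Character> are all hypothetical stand-ins. In SCF.java the cached values would presumably be Symbol objects, each obtained once through getAmbiguity as in the original post. The point shown is only that the sets are built a single time in a static initializer and then looked up on every call.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration class; not part of BioJava.
final class IupacLookup {

    // Built once when the class is loaded, then shared by every call.
    private static final Map<Character, Set<Character>> TWO_BASE_CODES;

    static {
        Map<Character, Set<Character>> m = new HashMap<>();
        m.put('R', bases('A', 'G'));
        m.put('Y', bases('C', 'T'));
        m.put('S', bases('C', 'G'));
        m.put('W', bases('A', 'T'));
        m.put('K', bases('G', 'T'));
        m.put('M', bases('A', 'C'));
        TWO_BASE_CODES = Collections.unmodifiableMap(m);
    }

    private IupacLookup() {
        // static helpers only
    }

    // Wraps the given bases in an unmodifiable set.
    private static Set<Character> bases(Character... b) {
        return Collections.unmodifiableSet(new HashSet<>(Arrays.asList(b)));
    }

    // Returns the cached base set for a two-base IUPAC code, ignoring case.
    // No new set is allocated here; the same instance is returned each time.
    static Set<Character> decodeAmbiguity(byte call) {
        char c = Character.toUpperCase((char) call);
        Set<Character> result = TWO_BASE_CODES.get(c);
        if (result == null) {
            throw new IllegalArgumentException("Not a two-base IUPAC code: " + c);
        }
        return result;
    }
}
```

Because the lookup returns the same cached instance for 'w' and 'W', decoding a long trace allocates nothing per base call, which is the efficiency gain the reply describes.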
>>> >>
>>> >>
>>> >>
>>> >> 2008/10/31 community at struck.lu <community at struck.lu>:
>>> >> > Hello,
>>> >> >
>>> >> >
>>> >> > I am using the SCF class in the context of HIV-1 population sequencing.
>>> >> > In this context we sometimes have ambiguous base calls. To support them
>>> >> > I extended the SCF class to allow for IUPAC ambiguities of up to 2
>>> >> > nucleotides.
>>> >> >
>>> >> > Therefore I simply added the following code to the "decode" function:
>>> >> >
>>> >> > #########################
>>> >> >        public Symbol decode(byte call) throws IllegalSymbolException {
>>> >> >
>>> >> >            //get the DNA Alphabet
>>> >> >            Alphabet dna = DNATools.getDNA();
>>> >> >
>>> >> >            char c = (char) call;
>>> >> >            switch (c) {
>>> >> >                case 'a':
>>> >> >                case 'A':
>>> >> >                    return DNATools.a();
>>> >> >                case 'c':
>>> >> >                case 'C':
>>> >> >                    return DNATools.c();
>>> >> >                case 'g':
>>> >> >                case 'G':
>>> >> >                    return DNATools.g();
>>> >> >                case 't':
>>> >> >                case 'T':
>>> >> >                    return DNATools.t();
>>> >> >                case 'n':
>>> >> >                case 'N':
>>> >> >                    return DNATools.n();
>>> >> >                case '-':
>>> >> >                    return DNATools.getDNA().getGapSymbol();
>>> >> >                case 'w':
>>> >> >                case 'W':
>>> >> >                    //make the 'W' symbol
>>> >> >                    Set symbolsThatMakeW = new HashSet();
>>> >> >                    symbolsThatMakeW.add(DNATools.a());
>>> >> >                    symbolsThatMakeW.add(DNATools.t());
>>> >> >                    Symbol w = dna.getAmbiguity(symbolsThatMakeW);
>>> >> >                    return w;
>>> >> >                case 's':
>>> >> >                case 'S':
>>> >> >                    //make the 'S' symbol
>>> >> >                    Set symbolsThatMakeS = new HashSet();
>>> >> >                    symbolsThatMakeS.add(DNATools.c());
>>> >> >                    symbolsThatMakeS.add(DNATools.g());
>>> >> >                    Symbol s = dna.getAmbiguity(symbolsThatMakeS);
>>> >> >                    return s;
>>> >> > ... (and so on)
>>> >> > #########################
>>> >> >
>>> >> > Is this the right way to do it? And if so, how can this code be submitted
>>> >> > to the official biojava source code?
>>> >> >
>>> >> >
>>> >> > Best regards,
>>> >> > Daniel Struck
>>> >> > _________________________________________________________
>>> >> > Mail sent using root eSolutions Webmailer - www.root.lu
>>> >> >
>>> >> >
>>> >> > _______________________________________________
>>> >> > Biojava-l mailing list  -  Biojava-l at lists.open-bio.org
>>> >> > http://lists.open-bio.org/mailman/listinfo/biojava-l
>>> >> >



-- 
Richard Holland, BSc MBCS
Finance Director, Eagle Genomics Ltd
M: +44 7500 438846 | E: holland at eaglegenomics.com
http://www.eaglegenomics.com/


