[Biojava-l] SCF: support for ambiguities

Richard Holland holland at eaglegenomics.com
Mon Nov 3 18:22:29 UTC 2008


Thanks for the fix. I'll review this and get back to you in a couple of days.

cheers,
Richard

2008/11/3 community at struck.lu <community at struck.lu>:
> I have added the missing ambiguities to DNATools.java and then used these in
> SCF.java.
> The two patches are appended to this email.
>
> Greetings,
> Daniel
>
>
> "Richard Holland" <holland at eaglegenomics.com> wrote:
>
>> A patch would be much appreciated!
>>
>> cheers,
>> Richard
>>
>> 2008/10/31 community at struck.lu <community at struck.lu>:
>> > True. It was a first quick and dirty hack to get the rest of my project
>> > going.
>> >
>> > I think adding support for the IUPAC ambiguities to DNATools would be the
>> > most appropriate solution. The SCF class can then easily be adapted.
>> >
>> > Are there any plans to do so?
>> > If not, I could give it a try and submit a patch for DNATools and SCF.
>> >
>> > Greetings,
>> > Daniel
>> >
>> > "Richard Holland" <holland at eaglegenomics.com> wrote:
>> >
>> >> It is the correct method, yes.
>> >>
>> >> However, your code constructs a new hash set every time it does the
>> >> check for W or S, etc. It would be much more efficient to create
>> >> class-static references to the ambiguity symbols you need, instead of
>> >> (re)creating them every time they're encountered. A class-static gap
>> >> symbol reference would also be good in this situation.
>> >>
>> >> cheers,
>> >> Richard
>> >>
>> >>
>> >>
>> >> 2008/10/31 community at struck.lu <community at struck.lu>:
>> >> > Hello,
>> >> >
>> >> >
>> >> > I am using the SCF class in the context of HIV-1 population sequencing.
>> >> > In this context we sometimes have ambiguous base calls. To support them
>> >> > I extended the SCF class to allow for IUPAC ambiguities of up to 2
>> >> > nucleotides.
>> >> >
>> >> > Therefore I simply added the following code to the "decode" function:
>> >> >
>> >> > #########################
>> >> >        public Symbol decode(byte call) throws IllegalSymbolException {
>> >> >
>> >> >            //get the DNA Alphabet
>> >> >            Alphabet dna = DNATools.getDNA();
>> >> >
>> >> >            char c = (char) call;
>> >> >            switch (c) {
>> >> >                case 'a':
>> >> >                case 'A':
>> >> >                    return DNATools.a();
>> >> >                case 'c':
>> >> >                case 'C':
>> >> >                    return DNATools.c();
>> >> >                case 'g':
>> >> >                case 'G':
>> >> >                    return DNATools.g();
>> >> >                case 't':
>> >> >                case 'T':
>> >> >                    return DNATools.t();
>> >> >                case 'n':
>> >> >                case 'N':
>> >> >                    return DNATools.n();
>> >> >                case '-':
>> >> >                    return DNATools.getDNA().getGapSymbol();
>> >> >                case 'w':
>> >> >                case 'W':
>> >> >                    //make the 'W' symbol
>> >> >                    Set symbolsThatMakeW = new HashSet();
>> >> >                    symbolsThatMakeW.add(DNATools.a());
>> >> >                    symbolsThatMakeW.add(DNATools.t());
>> >> >                    Symbol w = dna.getAmbiguity(symbolsThatMakeW);
>> >> >                    return w;
>> >> >                case 's':
>> >> >                case 'S':
>> >> >                    //make the 'S' symbol
>> >> >                    Set symbolsThatMakeS = new HashSet();
>> >> >                    symbolsThatMakeS.add(DNATools.c());
>> >> >                    symbolsThatMakeS.add(DNATools.g());
>> >> >                    Symbol s = dna.getAmbiguity(symbolsThatMakeS);
>> >> >                    return s;
>> >> > ... (and so on)
>> >> > #########################
>> >> >
>> >> > Is this the right way to do it? And if so, how can this code be
>> >> > submitted to the official biojava source code?
>> >> >
>> >> >
>> >> > Best regards,
>> >> > Daniel Struck
>> >> > _________________________________________________________
>> >> > Mail sent using root eSolutions Webmailer - www.root.lu
>> >> >
>> >> >
>> >> > _______________________________________________
>> >> > Biojava-l mailing list  -  Biojava-l at lists.open-bio.org
>> >> > http://lists.open-bio.org/mailman/listinfo/biojava-l
>> >> >
>> >>
>> >>
>> >
>> >
>> >
>> >
>> >
>>
>>
>
>
>
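
For readers following the thread, the "class-static references" suggestion quoted above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patch that was submitted: the `Base` enum and the `AmbiguityTable` class are hypothetical stand-ins that use plain Java collections, so the example runs without BioJava. In BioJava itself the static fields would presumably hold the Symbol objects returned by `DNATools.getDNA().getAmbiguity(Set)`, each created once. The gap character is left out to keep the sketch short.

```java
import java.util.Collections;
import java.util.EnumSet;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for BioJava's four atomic DNA symbols.
enum Base { A, C, G, T }

// Hypothetical helper, not part of BioJava: every ambiguity set is
// built once at class-load time instead of on each decode() call.
final class AmbiguityTable {
    private static final Map<Character, Set<Base>> CODES = new HashMap<>();

    static {
        put('A', Base.A);
        put('C', Base.C);
        put('G', Base.G);
        put('T', Base.T);
        // IUPAC two-nucleotide ambiguity codes
        put('R', Base.A, Base.G);
        put('Y', Base.C, Base.T);
        put('S', Base.C, Base.G);
        put('W', Base.A, Base.T);
        put('K', Base.G, Base.T);
        put('M', Base.A, Base.C);
        // N stands for any base
        put('N', Base.A, Base.C, Base.G, Base.T);
    }

    private AmbiguityTable() { }

    private static void put(char code, Base first, Base... rest) {
        CODES.put(code, Collections.unmodifiableSet(EnumSet.of(first, rest)));
    }

    // Case-insensitive lookup; returns the shared, read-only set for the call.
    static Set<Base> decode(byte call) {
        char c = Character.toUpperCase((char) call);
        Set<Base> bases = CODES.get(c);
        if (bases == null) {
            throw new IllegalArgumentException("Unknown base call: " + c);
        }
        return bases;
    }
}
```

Because the table is filled in a static initializer, repeated calls such as `AmbiguityTable.decode((byte) 'w')` return the same shared object and allocate nothing, which is the efficiency point raised in the quoted reply.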



-- 
Richard Holland, BSc MBCS
Finance Director, Eagle Genomics Ltd
M: +44 7500 438846 | E: holland at eaglegenomics.com
http://www.eaglegenomics.com/


