[Biojava-l] SCF: support for ambiguities
Richard Holland
holland at eaglegenomics.com
Mon Nov 3 18:22:29 UTC 2008
Thanks for the fix. I'll review this and get back to you in a couple of days.
cheers,
Richard
2008/11/3 community at struck.lu <community at struck.lu>:
> I have added the missing ambiguities to DNATools.java and then used these in
> SCF.java.
> The two patches are appended to this email.
>
> Greetings,
> Daniel
>
>
> "Richard Holland" <holland at eaglegenomics.com> wrote:
>
>> A patch would be much appreciated!
>>
>> cheers,
>> Richard
>>
>> 2008/10/31 community at struck.lu <community at struck.lu>:
>> > True. It was a first quick-and-dirty hack to get the rest of my project
>> > going.
>> >
>> > I think adding support for the IUPAC ambiguities to DNATools would be the
>> > most appropriate solution. The SCF class can then easily be adapted.
>> >
>> > Are there any plans to do so?
>> > If not, I could give it a try and submit a patch for DNATools and SCF.
>> >
>> > Greetings,
>> > Daniel
>> >
>> > "Richard Holland" <holland at eaglegenomics.com> wrote:
>> >
>> >> It is the correct method, yes.
>> >>
>> >> However, your code constructs a new hash set every time it does the
>> >> check for W or S, etc. It would be much more efficient to create
>> >> class-static references to the ambiguity symbols you need, instead of
>> >> (re)creating them every time they're encountered. A class-static gap
>> >> symbol reference would also be good in this situation.
>> >>
>> >> cheers,
>> >> Richard
>> >>
>> >>
>> >>
>> >> 2008/10/31 community at struck.lu <community at struck.lu>:
>> >> > Hello,
>> >> >
>> >> >
>> >> > I am using the SCF class in the context of HIV-1 population sequencing.
>> >> > In this context we sometimes have ambiguous base calls. To support them,
>> >> > I extended the SCF class to allow for IUPAC ambiguities of up to 2
>> >> > nucleotides.
>> >> >
>> >> > Therefore I simply added the following code to the "decode" function:
>> >> >
>> >> > #########################
>> >> > public Symbol decode(byte call) throws IllegalSymbolException {
>> >> >
>> >> >     // get the DNA alphabet
>> >> >     Alphabet dna = DNATools.getDNA();
>> >> >
>> >> >     char c = (char) call;
>> >> >     switch (c) {
>> >> >         case 'a':
>> >> >         case 'A':
>> >> >             return DNATools.a();
>> >> >         case 'c':
>> >> >         case 'C':
>> >> >             return DNATools.c();
>> >> >         case 'g':
>> >> >         case 'G':
>> >> >             return DNATools.g();
>> >> >         case 't':
>> >> >         case 'T':
>> >> >             return DNATools.t();
>> >> >         case 'n':
>> >> >         case 'N':
>> >> >             return DNATools.n();
>> >> >         case '-':
>> >> >             return DNATools.getDNA().getGapSymbol();
>> >> >         case 'w':
>> >> >         case 'W':
>> >> >             // make the 'W' symbol
>> >> >             Set symbolsThatMakeW = new HashSet();
>> >> >             symbolsThatMakeW.add(DNATools.a());
>> >> >             symbolsThatMakeW.add(DNATools.t());
>> >> >             Symbol w = dna.getAmbiguity(symbolsThatMakeW);
>> >> >             return w;
>> >> >         case 's':
>> >> >         case 'S':
>> >> >             // make the 'S' symbol
>> >> >             Set symbolsThatMakeS = new HashSet();
>> >> >             symbolsThatMakeS.add(DNATools.c());
>> >> >             symbolsThatMakeS.add(DNATools.g());
>> >> >             Symbol s = dna.getAmbiguity(symbolsThatMakeS);
>> >> >             return s;
>> >> >     ... (and so on)
>> >> > #########################
>> >> >
>> >> > Is this the right way to do it? And if so, how can this code be submitted
>> >> > to the official BioJava source code?
>> >> >
>> >> >
>> >> > Best regards,
>> >> > Daniel Struck
>> >> > _________________________________________________________
>> >> > Mail sent using root eSolutions Webmailer - www.root.lu
>> >> >
>> >> >
>> >> > _______________________________________________
>> >> > Biojava-l mailing list - Biojava-l at lists.open-bio.org
>> >> > http://lists.open-bio.org/mailman/listinfo/biojava-l
>> >> >
>> >>
>> >>
>> >
>> >
>>
>>
>
>
--
Richard Holland, BSc MBCS
Finance Director, Eagle Genomics Ltd
M: +44 7500 438846 | E: holland at eaglegenomics.com
http://www.eaglegenomics.com/
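
The class-static approach suggested earlier in the thread can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patch that was submitted: to stay self-contained it does not use BioJava at all, and the class name AmbiguityCache, its decode method, and the use of a Set<Character> in place of a BioJava Symbol are all hypothetical stand-ins. The point it shows is that each ambiguity is built once, when the class is loaded, and then reused on every call instead of being reconstructed from a fresh HashSet.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for illustration only; this is not a BioJava class.
// A Set<Character> plays the role that a Symbol would play in BioJava.
final class AmbiguityCache {

    // Class-static lookup table, filled exactly once at class-load time.
    private static final Map<Character, Set<Character>> BY_CODE = new HashMap<>();

    static {
        // Unambiguous bases.
        register('A', "A");
        register('C', "C");
        register('G', "G");
        register('T', "T");
        // IUPAC two-base ambiguity codes.
        register('W', "AT");
        register('S', "CG");
        register('R', "AG");
        register('Y', "CT");
        register('K', "GT");
        register('M', "AC");
        // Any base.
        register('N', "ACGT");
    }

    private AmbiguityCache() {
        // Static utility class; no instances.
    }

    // Builds one immutable set per code and stores it in the static table.
    private static void register(char code, String bases) {
        Set<Character> set = new TreeSet<>();
        for (char base : bases.toCharArray()) {
            set.add(base);
        }
        BY_CODE.put(code, Collections.unmodifiableSet(set));
    }

    // Returns the cached set for a base call, ignoring case.
    // No new collection is allocated per call.
    static Set<Character> decode(char call) {
        Set<Character> cached = BY_CODE.get(Character.toUpperCase(call));
        if (cached == null) {
            throw new IllegalArgumentException("Unsupported base call: " + call);
        }
        return cached;
    }
}
```

With this layout, AmbiguityCache.decode('w') and AmbiguityCache.decode('W') return the very same cached object. In BioJava itself the analogous idea would presumably be to hold the results of getAmbiguity (and the gap symbol) in static final fields rather than rebuilding them inside decode.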