[Biojava-dev] Bug in PHYLIPFileBuilder?

Felipe Albrecht felipe.albrecht at gmail.com
Thu Jun 7 04:02:55 UTC 2007


Sorry, there is a mistake in my example file: seq is never assigned, so
it is still null when it is used. Replace the line
builder.setSitesCount(seq.seqString().length());
with
builder.setSitesCount(l.get(l.size()-1).seqString().length());
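
The corrected line takes the site count from the last sequence only, which is fine as long as every row of the alignment has the same length. A minimal sketch of a check for that, using plain strings rather than BioJava Sequence objects (the class and method names here are hypothetical, not part of BioJava):

```java
import java.util.Arrays;
import java.util.List;

class SiteCount {
    /** Returns the common length of all rows; throws if the rows differ in length. */
    static int siteCount(List<String> rows) {
        if (rows.isEmpty()) {
            throw new IllegalArgumentException("alignment has no sequences");
        }
        int sites = rows.get(0).length();
        for (String row : rows) {
            if (row.length() != sites) {
                throw new IllegalArgumentException(
                        "expected " + sites + " sites but found " + row.length());
            }
        }
        return sites;
    }

    public static void main(String[] args) {
        // Three aligned rows of six columns each (gaps included).
        List<String> rows = Arrays.asList("MKV-QL", "MKVAQL", "M-VAQL");
        System.out.println(siteCount(rows)); // prints 6
    }
}
```

The value returned could then be passed to builder.setSitesCount(...) in place of the length of a single sequence.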

Thanks

Felipe Albrecht


On 6/7/07, Felipe Albrecht <felipe.albrecht at gmail.com> wrote:
> Hello, I'm trying to convert a protein multiple alignment from FASTA
> format to PHYLIP format.
>
> The source is:
>
>                 BufferedReader br = new BufferedReader(new FileReader(args[0]));
>                 PHYLIPFileBuilder builder = new PHYLIPFileBuilder();
>                 RichSequenceIterator richSequenceIterator =
>                         IOTools.readFastaProtein(br, null);
>                 List<Sequence> l = new LinkedList<Sequence>();
>                 Sequence seq = null;
>                 while (richSequenceIterator.hasNext()) {
>                         l.add(richSequenceIterator.nextSequence());
>                 }
>                 builder.startFile();
>                 builder.setSequenceCount(l.size());
>                 builder.setSitesCount(seq.seqString().length());
>                 for (Sequence sequence : l) {
>                         builder.setCurrentSequenceName(sequence.getName());
>                         builder.receiveSequence(sequence.seqString());
>                 }
>                 builder.endFile();
>
> As I said, my input data is a protein multiple alignment.
>
> Running this code produces the following stack trace:
>
> Exception in thread "main" org.biojava.bio.BioError: Something has gone badly wrong with DNA
>         at org.biojava.bio.seq.DNATools.createDNASequence(DNATools.java:199)
>         at org.biojava.bio.seq.DNATools.createGappedDNASequence(DNATools.java:207)
>         at org.biojavax.bio.phylo.io.phylip.PHYLIPFileBuilder.createSequences(PHYLIPFileBuilder.java:121)
>         at org.biojavax.bio.phylo.io.phylip.PHYLIPFileBuilder.buildAlignment(PHYLIPFileBuilder.java:94)
>         at org.biojavax.bio.phylo.io.phylip.PHYLIPFileBuilder.endFile(PHYLIPFileBuilder.java:63)
>         at FastaParser.main(FastaParser.java:54)
> Caused by: org.biojava.bio.symbol.IllegalSymbolException: This tokenization doesn't contain character: 'Q'
>         at org.biojava.bio.seq.io.CharacterTokenization.parseTokenChar(CharacterTokenization.java:175)
>         at org.biojava.bio.seq.io.CharacterTokenization$TPStreamParser.characters(CharacterTokenization.java:246)
>         at org.biojava.bio.symbol.SimpleSymbolList.<init>(SimpleSymbolList.java:178)
>         at org.biojava.bio.seq.DNATools.createDNA(DNATools.java:173)
>         at org.biojava.bio.seq.DNATools.createDNASequence(DNATools.java:195)
>         ... 5 more
>
> IMHO, the bug is at line 121 of PHYLIPFileBuilder.java, in the method
> createSequences, which does:
>         try {
>           DNATools.createGappedDNASequence(sequence, name);
>         } catch (IllegalSymbolException e) {
>           isDNA = false;
>         }
>
> This code expects DNATools.createGappedDNASequence to throw an
> IllegalSymbolException, but looking at that method, in the file
> DNATools.java at line 198:
>     } catch (BioException se) {
>       throw new BioError("Something has gone badly wrong with DNA", se);
>     }
> Since IllegalSymbolException is a subclass of BioException, it is
> caught there and wrapped in a new BioError, which is not caught in
> PHYLIPFileBuilder.createSequences.
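
To make the mechanism concrete, here is a small self-contained sketch of the same pattern. The three exception classes are hypothetical stand-ins that only mimic the inheritance of the BioJava types named in the trace, and the DNA check is deliberately simplified (no ambiguity codes); it is not the BioJava code itself.

```java
class WrappedExceptionDemo {
    // Hypothetical stand-ins for the BioJava types named in the trace.
    static class BioException extends Exception {
        BioException(String message) { super(message); }
    }
    static class IllegalSymbolException extends BioException {
        IllegalSymbolException(String message) { super(message); }
    }
    static class BioError extends Error {
        BioError(String message, Throwable cause) { super(message, cause); }
    }

    // Simplified parser: accepts only a, c, g, t and the gap character.
    static void parse(String seq) throws IllegalSymbolException {
        for (char c : seq.toCharArray()) {
            if ("acgt-".indexOf(Character.toLowerCase(c)) < 0) {
                throw new IllegalSymbolException("illegal character: '" + c + "'");
            }
        }
    }

    // Same shape as the quoted catch block: the broad catch also swallows
    // IllegalSymbolException and rethrows it as an unchecked error.
    static void createDna(String seq) throws IllegalSymbolException {
        try {
            parse(seq);
        } catch (BioException e) {
            throw new BioError("Something has gone badly wrong with DNA", e);
        }
    }

    // Same shape as the caller: it waits for an exception that never arrives.
    static String classify(String seq) {
        try {
            createDna(seq);
            return "DNA";
        } catch (IllegalSymbolException e) {
            return "not DNA";
        } catch (BioError e) {
            // Added only so the demo can report what escaped.
            return "BioError escaped: " + e.getCause().getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(classify("ACGT-A")); // prints DNA
        System.out.println(classify("MKVQL"));  // prints BioError escaped: IllegalSymbolException
    }
}
```

The "not DNA" branch is unreachable in this sketch, which is the situation described above: the caller's catch clause compiles, but the exception it names is always converted before it gets there.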
>
>
> I solved this problem by adding:
>     } catch (IllegalSymbolException ie) {
>         throw ie;
>     }
>
> to createDNASequence, but it is only a workaround so that the exception can be caught.
>
> A better solution is to check the type of the sequence. Is there a method
> to discover whether a sequence is DNA, RNA, protein, or invalid? If so, it
> should be used here. Also, exceptions should be reserved for exceptional
> conditions and not used for flow control.
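
As an illustration of the idea only, a check that does not rely on exceptions could look roughly like the sketch below. It is plain Java, not a BioJava API: the class name, the method name, and the choice of accepted characters (IUPAC nucleotide codes plus two gap characters) are all assumptions made for the example.

```java
class AlphabetGuess {
    // IUPAC nucleotide codes (including U for RNA) plus common gap characters.
    private static final String NUCLEOTIDE_CHARS = "ACGTURYSWKMBDHVN-.";

    /** Returns true if every character is a nucleotide code or a gap. */
    static boolean looksLikeNucleotide(String seq) {
        for (int i = 0; i < seq.length(); i++) {
            char c = Character.toUpperCase(seq.charAt(i));
            if (NUCLEOTIDE_CHARS.indexOf(c) < 0) {
                return false; // ordinary return value, no exception needed
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(looksLikeNucleotide("acgt-n")); // prints true
        System.out.println(looksLikeNucleotide("MKVQL"));  // prints false
    }
}
```

This is a heuristic: a short protein made up only of letters that are also nucleotide codes would be misclassified, and an empty string is reported as a nucleotide sequence. Letting the caller state the sequence type explicitly would avoid the guess altogether.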
>
> PS: I'm using the BioJava source code downloaded today from
> http://www.biojava.org/download/bj15b/all/biojava-1.5-beta2.tar.gz
>
> Thanks, and I'm waiting for opinions.
>
> Felipe Albrecht
>


