[Biojava-dev] [Bug 2311] New: Bug in PHYLIPFileBuilder with protein sequences

bugzilla-daemon at portal.open-bio.org bugzilla-daemon at portal.open-bio.org
Thu Jun 7 04:07:47 UTC 2007


http://bugzilla.open-bio.org/show_bug.cgi?id=2311

           Summary: Bug in PHYLIPFileBuilder with protein sequences
           Product: BioJava
           Version: unspecified
          Platform: PC
        OS/Version: Linux
            Status: NEW
          Severity: minor
          Priority: P2
         Component: seq.io
        AssignedTo: biojava-dev at biojava.org
        ReportedBy: felipe.albrecht at gmail.com


Hello, I'm trying to convert a protein multiple alignment from FASTA
format to PHYLIP format.

The source is:

               BufferedReader br = new BufferedReader(new FileReader(args[0]));
               PHYLIPFileBuilder builder = new PHYLIPFileBuilder();
               RichSequenceIterator richSequenceIterator = IOTools.readFastaProtein(br, null);
               // Read every sequence of the alignment into memory.
               List<Sequence> l = new LinkedList<Sequence>();
               while (richSequenceIterator.hasNext()) {
                       l.add(richSequenceIterator.nextSequence());
               }
               builder.startFile();
               builder.setSequenceCount(l.size());
               // All sequences in an alignment have the same length, so use the first one.
               builder.setSitesCount(l.get(0).seqString().length());
               for (Sequence sequence : l) {
                       builder.setCurrentSequenceName(sequence.getName());
                       builder.receiveSequence(sequence.seqString());
               }
               builder.endFile();

As I said, my input data is a protein multiple alignment.



Running this code produces the following stack trace:

Exception in thread "main" org.biojava.bio.BioError: Something has gone badly wrong with DNA
       at org.biojava.bio.seq.DNATools.createDNASequence(DNATools.java:199)
       at org.biojava.bio.seq.DNATools.createGappedDNASequence(DNATools.java:207)
       at org.biojavax.bio.phylo.io.phylip.PHYLIPFileBuilder.createSequences(PHYLIPFileBuilder.java:121)
       at org.biojavax.bio.phylo.io.phylip.PHYLIPFileBuilder.buildAlignment(PHYLIPFileBuilder.java:94)
       at org.biojavax.bio.phylo.io.phylip.PHYLIPFileBuilder.endFile(PHYLIPFileBuilder.java:63)
       at FastaParser.main(FastaParser.java:54)
Caused by: org.biojava.bio.symbol.IllegalSymbolException: This tokenization doesn't contain character: 'Q'
       at org.biojava.bio.seq.io.CharacterTokenization.parseTokenChar(CharacterTokenization.java:175)
       at org.biojava.bio.seq.io.CharacterTokenization$TPStreamParser.characters(CharacterTokenization.java:246)
       at org.biojava.bio.symbol.SimpleSymbolList.<init>(SimpleSymbolList.java:178)
       at org.biojava.bio.seq.DNATools.createDNA(DNATools.java:173)
       at org.biojava.bio.seq.DNATools.createDNASequence(DNATools.java:195)
       ... 5 more

IMHO, the bug is at line 121 of PHYLIPFileBuilder.java, in the method
createSequences, which does:
       try {
         DNATools.createGappedDNASequence(sequence, name);
       } catch (IllegalSymbolException e) {
         isDNA = false;
       }

This code expects DNATools.createGappedDNASequence to throw an
IllegalSymbolException. However, following the call into
createDNASequence, line 198 of DNATools.java reads:
   } catch (BioException se) {
     throw new BioError("Something has gone badly wrong with DNA", se);
   }
Because IllegalSymbolException is a subclass of BioException, it is
caught there and wrapped in a new BioError, and that BioError is not
caught in PHYLIPFileBuilder.createSequences.
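
The wrapping described above can be reproduced without BioJava. The
following is only a minimal sketch: CheckedSymbolException, WrappedError
and the methods of WrappingDemo are hypothetical stand-ins for
IllegalSymbolException, BioError, the DNATools methods and the check in
createSequences. They are not the library code.

```java
// Hypothetical stand-in for a checked exception such as IllegalSymbolException.
class CheckedSymbolException extends Exception {
    CheckedSymbolException(String message) { super(message); }
}

// Hypothetical stand-in for an unchecked error such as BioError.
class WrappedError extends Error {
    WrappedError(String message, Throwable cause) { super(message, cause); }
}

class WrappingDemo {
    // Low-level parser: rejects every character outside a, c, g, t.
    static void parseDna(String s) throws CheckedSymbolException {
        for (char c : s.toLowerCase().toCharArray()) {
            if ("acgt".indexOf(c) < 0) {
                throw new CheckedSymbolException("unknown character: '" + c + "'");
            }
        }
    }

    // Wraps the checked exception in an Error, so callers never see the original type.
    static void createDnaSequence(String s) {
        try {
            parseDna(s);
        } catch (Exception e) {
            throw new WrappedError("Something has gone badly wrong with DNA", e);
        }
    }

    // Declares the checked exception but never actually throws it.
    static void createGappedDnaSequence(String s) throws CheckedSymbolException {
        createDnaSequence(s.replace("-", ""));
    }

    // The caller's check: the catch clause below is never reached for bad input,
    // because a WrappedError is thrown instead of a CheckedSymbolException.
    static boolean isDna(String s) {
        try {
            createGappedDnaSequence(s);
            return true;
        } catch (CheckedSymbolException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isDna("ACG-T"));
        try {
            System.out.println(isDna("MQK"));
        } catch (WrappedError e) {
            System.out.println("escaped the catch clause: " + e.getCause());
        }
    }
}
```

In this sketch isDna never returns false. Input that is not DNA ends the
call with a WrappedError, which is the same shape as the trace above.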


I worked around this problem by adding:
   } catch (IllegalSymbolException ie) {
       throw ie;
   }

to createDNASequence, but this is only a workaround that lets the
exception be caught.

A better solution is to check the type of the sequence. Is there a
method for discovering whether a sequence is DNA, RNA, protein, or
invalid? If so, it should be used here. Also, exceptions should be
reserved for exceptional conditions and not used for flow control.
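
To illustrate what such a check could look like without using exceptions
for flow control, here is a rough sketch. SequenceTypeGuess, its Kind
enum and the three alphabet strings are assumptions made for this
example, not BioJava API, and the alphabets are deliberately simplified.
The result is only a heuristic, since a short protein made up solely of
A, C, G and T would be reported as DNA.

```java
import java.util.Locale;

// Hypothetical helper, not part of BioJava.
class SequenceTypeGuess {
    enum Kind { DNA, RNA, PROTEIN, UNKNOWN }

    // Simplified alphabets (assumption): bases plus N, and the 20 amino acids
    // plus the ambiguity codes B, Z, X and the stop symbol '*'.
    private static final String DNA = "ACGTN";
    private static final String RNA = "ACGUN";
    private static final String PROTEIN = "ACDEFGHIKLMNPQRSTVWYBZX*";

    // True if every character of residues occurs in alphabet.
    private static boolean allIn(String residues, String alphabet) {
        for (int i = 0; i < residues.length(); i++) {
            if (alphabet.indexOf(residues.charAt(i)) < 0) {
                return false;
            }
        }
        return true;
    }

    // Ignores case and gap characters, then tries the narrowest alphabet first.
    static Kind guess(String sequence) {
        String residues = sequence.toUpperCase(Locale.ROOT).replace("-", "");
        if (residues.isEmpty()) return Kind.UNKNOWN;
        if (allIn(residues, DNA)) return Kind.DNA;
        if (allIn(residues, RNA)) return Kind.RNA;
        if (allIn(residues, PROTEIN)) return Kind.PROTEIN;
        return Kind.UNKNOWN;
    }

    public static void main(String[] args) {
        System.out.println(guess("ACGT-ACGT"));
        System.out.println(guess("MQKL-V"));
    }
}
```

With a check of this kind, createSequences could choose the sequence
type up front, and an exception would then signal only genuinely invalid
input.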

PS: I'm using the BioJava source code downloaded today from
http://www.biojava.org/download/bj15b/all/biojava-1.5-beta2.tar.gz

