[Biojava-l] ParseException when using interleaved Nexus file

Richard Holland holland at eaglegenomics.com
Tue Aug 11 09:17:03 UTC 2009


I've found the problem - "interleave=yes" is not valid according to
the official NEXUS format spec (Maddison et al., 1997), which the
parser was written against.

Interleaved files are supposed to include only the word "interleave" -
it takes no parameters. Non-interleaved files shouldn't mention it at
all.

I've modified the parser to tolerate this but I'd be interested to
know where the invalid token came from - was it added manually, or by
an existing piece of publicly available software?

The modification has been made in the trunk of the biojava-live  
subversion repository.
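
If building from trunk isn't an option, one possible workaround is to
normalise the token before the file reaches the parser. The following is
only a minimal sketch under that assumption: the class NexusInterleaveFix
and its normalise method are hypothetical names made up for illustration
and are not part of BioJava. It rewrites "interleave=yes" to a bare
"interleave" and drops "interleave=no" entirely, which matches the two
rules described above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration only; not part of BioJava. */
final class NexusInterleaveFix {

    // Matches the non-standard "interleave=yes" / "interleave=no" token,
    // case-insensitively, allowing whitespace around the '='.
    private static final Pattern INTERLEAVE_PARAM =
            Pattern.compile("\\binterleave\\s*=\\s*(yes|no)\\b",
                            Pattern.CASE_INSENSITIVE);

    /**
     * Returns the NEXUS text with "interleave=yes" replaced by a bare
     * "interleave" and "interleave=no" removed. A bare "interleave"
     * token is left untouched.
     */
    static String normalise(String nexusText) {
        Matcher m = INTERLEAVE_PARAM.matcher(nexusText);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            boolean interleaved = m.group(1).equalsIgnoreCase("yes");
            // The replacement strings contain no '$' or '\', so they are
            // safe to pass to appendReplacement without quoting.
            m.appendReplacement(out, interleaved ? "interleave" : "");
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

The idea would be to read the file into a String, pass it through
normalise, write the result to a temporary file, and hand that file to
NexusFileFormat.parseFile(builder, f) as in David's original snippet.
Note that the sketch rewrites the token wherever it occurs in the text,
not only in the Format line, and that removing "interleave=no" leaves
the surrounding whitespace behind.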

cheers,
Richard

On 7 Aug 2009, at 11:10, David Johnson wrote:

> Hi Richard,
>
> Actually the original exception was thrown in a different file that my
> supervisor tried uploading to a Web app I'm developing that uses the
> BioJava Nexus parser, but I can't get hold of that particular file
> today. So the one I provided the link for was just another example of
> an interleaved Nexus file I Googled for when I got your first email
> this morning, as I figured they'd probably be the same formatting. But
> I remember it's definitely the same exception in both cases.
>
> I had a quick look in the example I provided today, and the
> interleave=yes token is definitely in the header of the data block,
> and is also definitely in the Format line.
>
> Oh, just FYI, I'm using the BioJava 1.7 binary distribution
> (http://www.biojava.org/download/bj17/bin/biojava.jar).
>
> Cheers,
> -David
>
> 2009/8/7 Richard Holland <holland at eaglegenomics.com>:
>> Thanks David. One more quick question - is this the exact file that
>> is throwing the exception? I haven't tested it yet - but if I could test
>> against the real file that is throwing the problem, that would help me
>> find out exactly what's going wrong.
>>
>> For what it's worth, the exception is normally thrown when more than
>> one interleave=yes/no token is found in the header of the Data or
>> Characters block, or when the interleave token appears in a line other
>> than the Format line of the header.
>>
>> cheers,
>> Richard
>>
>> On 7 Aug 2009, at 10:28, David Johnson wrote:
>>
>>> Hi Richard,
>>>
>>> Thanks for your mail. An example of an interleaved file can be found
>>> here:
>>>
>>> http://www.molecularevolution.org/si/resources/fileformats/files/dna.nex
>>>
>>> where the link pointing to the example file is from
>>> http://www.molecularevolution.org/si/resources/fileformats/ and under
>>> the NEXUS section.
>>>
>>> The specific error message is:
>>> "org.biojava.bio.seq.io.ParseException:
>>> Found unexpected token interleave=yes in CHARACTERS block"
>>>
>>> So it looks like the error is thrown reading the "interleave"
>>> parameter in the top of the data block, and before reaching the actual
>>> interleaved matrix data. Full stacktrace in attached .txt.
>>>
>>> Cheers,
>>> -David
>>>
>>> 2009/8/7 Richard Holland <holland at eaglegenomics.com>:
>>>>
>>>> Could you point me to an example of an interleaved file?
>>>>
>>>> And also the full stack trace of the exception that gets thrown?
>>>>
>>>> cheers,
>>>> Richard
>>>>
>>>> On 6 Aug 2009, at 18:03, David Johnson wrote:
>>>>
>>>>> Hi everyone,
>>>>>
>>>>> A quick question about the BioJava Nexus parser. I've been trying to
>>>>> use the Nexus file parser, simply by doing something like:
>>>>>
>>>>>      NexusFileBuilder builder = new NexusFileBuilder();
>>>>>      NexusFileFormat.parseFile(builder, f);
>>>>>
>>>>> However, when parsing Nexus files that are interleaved, I get a
>>>>> ParseException.
>>>>>
>>>>> Is there a way to set up the parser provided by BioJava to handle
>>>>> interleaved Nexus files?
>>>>>
>>>>> Thanks,
>>>>> -David

--
Richard Holland, BSc MBCS
Operations and Delivery Director, Eagle Genomics Ltd
T: +44 (0)1223 654481 ext 3 | E: holland at eaglegenomics.com
http://www.eaglegenomics.com/



