[Biojava-l] StringIndexOutOfBoundsException while parsing blast result

David Toomey dtoomey at rcsi.ie
Wed Oct 1 08:40:44 UTC 2008


They are on the same OS. For all my tests I have run the BLAST search and
the parsing on the same OS. This has mostly been Windows, but I have also
tried the whole thing on Linux and I get the same problem.
I have done some more testing and I don't think the carriage return is the
problem.
What I have found is that the error is thrown if the second line is shorter
than 11 characters. If I add 4 spaces in front of the 'GN=ISPF' on the second
line (making it exactly 11 characters long), it is parsed correctly, like this:

2,4-cyclodiphosphate synthase OS=Plasmodium falciparum (isolate 3D7) 
    GN=ISPF

I haven't figured out why it parses correctly when it is the only entry in
the file, even without the spaces. So maybe I am still missing something.
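
To illustrate the kind of failure I suspect, here is a minimal sketch. It is
not the BioJava parser code; it only assumes that something in the parser reads
a fixed-width prefix of 11 characters from each line. The class name, method
names and the PREFIX constant are all made up for the example.

```java
// Hypothetical illustration only; this is NOT the BioJava parser source.
class ShortLineDemo {
    // Assumed prefix width, picked to match the observed 11-character threshold.
    static final int PREFIX = 11;

    // Naive version: throws StringIndexOutOfBoundsException when the line
    // has fewer than PREFIX characters.
    static String naivePrefix(String line) {
        return line.substring(0, PREFIX);
    }

    // Defensive version: never reads past the end of the line.
    static String safePrefix(String line) {
        return line.substring(0, Math.min(PREFIX, line.length()));
    }

    public static void main(String[] args) {
        String shortLine = "GN=ISPF";       // 7 characters
        String paddedLine = "    GN=ISPF";  // 4 spaces + 7 characters = 11

        System.out.println("padded: [" + naivePrefix(paddedLine) + "]");
        try {
            naivePrefix(shortLine);
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("short line throws " + e.getClass().getSimpleName());
        }
        System.out.println("safe:   [" + safePrefix(shortLine) + "]");
    }
}
```

If the real cause is something like this, a length check before the substring
call would avoid the exception, but that is a guess until the offending line in
the parser is found.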

Cheers,

Dave

-----Original Message-----
From: dicknetherlands at gmail.com [mailto:dicknetherlands at gmail.com] On Behalf
Of Richard Holland
Sent: 30 September 2008 17:31
To: David Toomey
Cc: biojava-l at lists.open-bio.org
Subject: Re: [Biojava-l] StringIndexOutOfBoundsException while parsing blast
result

Sounds like it _might_ be something to do with the carriage return
itself. Is the blast file generated on the same OS that you're running
your analysis on? (e.g. you might run Blast on a Linux box, but
attempt to parse the file on a Windows box?). If the two OSes are
different, this might point to it - as Linux won't necessarily
understand the Windows linebreaks, or vice versa, and might
misinterpret them. When you copy the portion of the file to a new file
on the OS you're running the analysis on, it will substitute its own
local linebreaks and thus mask the problem.

So the first thing I'd check is what the two OSes involved are. If
they're different, try running your analysis program on the same OS as
the Blast output was generated on. If that does fix it, then try
putting your Blast files through dos2unix or something similar to
convert the linebreaks before running your analysis program.
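
If dos2unix isn't to hand, the same conversion can be sketched in plain Java.
This is only a rough equivalent under the assumption that the whole file fits
in memory as a String; the class and method names are made up for the example.

```java
// Hypothetical helper; a rough stand-in for dos2unix, not part of BioJava.
class LineEndingDemo {
    // Convert Windows (\r\n) and lone carriage return (\r) line breaks to \n.
    static String toUnix(String text) {
        return text.replace("\r\n", "\n").replace('\r', '\n');
    }

    // Quick check for whether any carriage returns are present at all.
    static boolean hasCarriageReturn(String text) {
        return text.indexOf('\r') >= 0;
    }

    public static void main(String[] args) {
        String windowsStyle = "line one\r\nline two\r\n";
        System.out.println("before: " + hasCarriageReturn(windowsStyle));
        System.out.println("after:  " + hasCarriageReturn(toUnix(windowsStyle)));
    }
}
```

Running hasCarriageReturn on the Blast output first is a cheap way to tell
whether linebreaks could be involved before converting anything.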

If they're the same OS, then we still have a problem!

cheers,
Richard






