[Biojava-l] Error writing large Fasta seqs?

Schreiber, Mark mark.schreiber@agresearch.co.nz
Mon, 2 Apr 2001 14:38:44 +1200


If I make the following change to FastaFormat the problem goes away, but I
am not sure whether this was the true cause of the problem:

    public void writeSequence(Sequence seq, PrintStream os) {
        os.print(">");
        os.println(describeSequence(seq));

        // int length = seq.length();
        // for(int i = 1; i <= length; i++) {
        //     os.write(seq.symbolAt(i).getToken());
        //     if( (i % lineWidth) == 0)
        //         os.println();
        // }
        // if( (length % lineWidth) != 0)
        //     os.println();

        for(int pos = 1; pos <= seq.length(); pos += lineWidth) {
            int end = Math.min(pos + lineWidth - 1, seq.length());
            os.println(seq.subStr(pos, end));
        }
    }

Originally the for statement read:


        for(int pos = 1; pos <= seq.length() +1; pos += lineWidth) {
            int end = Math.min(pos + lineWidth - 1, seq.length());
            os.println(seq.subStr(pos, end));
        }

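The +1 in pos <= seq.length() + 1 seems to be incorrect given the <=
evaluation: when seq.length() is an exact multiple of lineWidth, the loop
runs one extra time with pos = seq.length() + 1, so end comes out one less
than pos.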
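
To check the arithmetic against the stack trace quoted below, here is a
minimal standalone sketch of just the loop bounds. It does not use BioJava
at all; the class and method names are made up for illustration, and
lineWidth = 60 is an assumption (the trace does not state it, but the
sub-list length 77380 - 10001 + 1 = 67380 is an exact multiple of 60, which
fits the reported Start = 67381 End = 67380).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of BioJava: reproduces only the index
// arithmetic of the for loop in FastaFormat.writeSequence.
final class FastaLoopBounds {

    /** Returns the 1-based, inclusive {start, end} pairs the loop would pass to subStr. */
    static List<int[]> ranges(int length, int lineWidth, boolean plusOne) {
        List<int[]> out = new ArrayList<>();
        // plusOne = true mimics the original condition: pos <= seq.length() + 1
        int limit = plusOne ? length + 1 : length;
        for (int pos = 1; pos <= limit; pos += lineWidth) {
            int end = Math.min(pos + lineWidth - 1, length);
            out.add(new int[] {pos, end});
        }
        return out;
    }

    /** True if any range has end < start, which is what subStr rejects in the trace. */
    static boolean hasInvertedRange(List<int[]> ranges) {
        for (int[] r : ranges) {
            if (r[1] < r[0]) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int lineWidth = 60;              // assumption, see above
        int length = 77380 - 10001 + 1;  // sub-list length taken from the trace

        List<int[]> original = ranges(length, lineWidth, true);
        int[] last = original.get(original.size() - 1);
        System.out.println("with +1:    last range " + last[0] + ".." + last[1]);

        List<int[]> changed = ranges(length, lineWidth, false);
        last = changed.get(changed.size() - 1);
        System.out.println("without +1: last range " + last[0] + ".." + last[1]);
    }
}
```

Under these assumptions the original condition ends on the range
67381..67380, while the changed loop ends on 67321..67380. A length that is
not a multiple of lineWidth gives no inverted range either way, which would
explain why the error only shows up sometimes.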

Is it likely that I have broken something?

Mark

> -----Original Message-----
> From: Schreiber, Mark [mailto:mark.schreiber@agresearch.co.nz]
> Sent: Monday, April 02, 2001 2:17 PM
> To: 'Biojava-L (E-mail)
> Subject: [Biojava-l] Error writing large Fasta seqs?
> 
> 
> Hi,
> 
> I have been getting IndexOutOfBoundsException errors when writing a
> SequenceDB of large sequences to a fasta file. I added some diagnostics to
> the IndexOutOfBoundsException call in AbstractSymbolList$SubList.subStr
> and I get the stack trace below.
> 
> For some unknown reason FastaFormat is calling a subList method using an end
> value less than the start. It looks like it might be tied up with an error
> in the SubList of the AbstractSymbolList. Is there a limit on the maximum
> size for a sequence?
> 
> This error doesn't always occur, just on large sequences (sometimes).
> 
> java.lang.IndexOutOfBoundsException: Start = 67381 End = 67380 this.start = 10001 this.end = 77380
> 	at org.biojava.bio.symbol.AbstractSymbolList$SubList.subStr(AbstractSymbolList.java:212)
> 	at org.biojava.bio.seq.impl.SimpleSequence.subStr(SimpleSequence.java:82)
> 	at org.biojava.bio.seq.io.FastaFormat.writeSequence(FastaFormat.java:187)
> 	at org.biojava.bio.seq.io.StreamWriter.writeStream(StreamWriter.java:63)
> 
> 
> Can anyone help?
> 
> Mark
> _______________________________________________
> Biojava-l mailing list  -  Biojava-l@biojava.org
> http://biojava.org/mailman/listinfo/biojava-l
>