[Biojava-l] GenBank Parser Exception

Matthew Pocock mrp@sanger.ac.uk
Thu, 22 Nov 2001 11:54:09 +0000


Thanks Ron,

I've patched this in. Does something similar need to be done for Embl?

Matthew

Ron Kuhn wrote:

> I have another fix for an exception that I got when parsing GenBank
> sequences that have strands defined on the LOCUS line (e.g. AF343912).
> BioJava assumes that the strands and topology (when both given) are separate
> tokens. This is not true. Here is the fix for the processHeaderLine method
> in GenbankFormat.java:
> 
> Replace the existing LOCUS if-block with the following code:
> 
> if (line.startsWith(GenbankFormat.LOCUS_TAG))
> {
>     // the LOCUS line is a special case because it contains the
>     // locus, size, molecule type, GenBank division, and the date
>     // of last modification.
>     if (line.length() < 73)
>         throw new ParseException("LOCUS line too short [" + line + "]");
>
>     saveSeqAnno2(GenbankFormat.LOCUS_TAG, line.substring(12, 22));
>     saveSeqAnno2(GenbankFormat.SIZE_TAG, line.substring(22, 29));
>     saveSeqAnno2(GenbankFormat.STRAND_NUMBER_TAG, line.substring(33, 35));
>     saveSeqAnno2(GenbankFormat.TYPE_TAG, line.substring(36, 41));
>     saveSeqAnno2(GenbankFormat.CIRCULAR_TAG, line.substring(42, 52));
>     saveSeqAnno2(GenbankFormat.DIVISION_TAG, line.substring(52, 55));
>     saveSeqAnno2(GenbankFormat.DATE_TAG, line.substring(62, 73));
> }
> 
> And add the supporting method:
>     /**
>      * Private method to process a header tag and associated value.
>      *
>      * @param tag The tag to add
>      * @param value The value of the associated tag
>      * @throws ParseException Thrown when an error occurs parsing the file
>      */
>     private void saveSeqAnno2(String tag, String value)
>         throws ParseException
>     {
>         value = value.trim();    // strip whitespace
>         if (value.length() > 0) {
>             this.saveSeqAnno();
>             headerTag = tag;
>             headerTagText = new StringBuffer(value);
>         }
>     }
> 
> Ron Kuhn
> _______________________________________________
> Biojava-l mailing list  -  Biojava-l@biojava.org
> http://biojava.org/mailman/listinfo/biojava-l
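
For anyone who wants to try the fixed-column idea outside BioJava, here is a minimal standalone sketch. It reuses the column offsets and the 73-character length check from Ron's patch, but everything else is an assumption made for illustration: the class name LocusLineSketch, the string keys, the plain Map standing in for the annotation listener, and the made-up sample line (TESTSEQ1 is not a real accession). It is not the BioJava implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Standalone sketch, not part of BioJava. All names here are hypothetical.
class LocusLineSketch {

    // Store a trimmed column range under the given key, skipping blank
    // fields (the same "only save non-empty values" rule as saveSeqAnno2).
    private static void put(Map<String, String> out, String key,
                            String line, int start, int end) {
        String value = line.substring(start, end).trim();
        if (value.length() > 0) {
            out.put(key, value);
        }
    }

    // Split a LOCUS line by fixed column offsets instead of by tokens.
    static Map<String, String> parse(String line) {
        if (line.length() < 73) {
            throw new IllegalArgumentException(
                "LOCUS line too short [" + line + "]");
        }
        Map<String, String> out = new LinkedHashMap<String, String>();
        put(out, "LOCUS", line, 12, 22);
        put(out, "SIZE", line, 22, 29);
        put(out, "STRAND_NUMBER", line, 33, 35);
        put(out, "TYPE", line, 36, 41);
        put(out, "CIRCULAR", line, 42, 52);
        put(out, "DIVISION", line, 52, 55);
        put(out, "DATE", line, 62, 73);
        return out;
    }

    // Build a made-up 73-column LOCUS line; the field widths are chosen
    // so that each value lands on the offsets used in parse().
    static String sampleLine() {
        return String.format(
            "%-12s%-10s%7s %-3s%-3s%-5s %-10s%-3s%7s%-11s",
            "LOCUS", "TESTSEQ1", "1234", "bp", "ds-", "DNA",
            "circular", "PLN", "", "01-JAN-2000");
    }

    public static void main(String[] args) {
        for (Map.Entry<String, String> e : parse(sampleLine()).entrySet()) {
            System.out.println(e.getKey() + " = " + e.getValue());
        }
    }
}
```

Because the strand ("ds-") and the molecule type ("DNA") are read from separate column ranges, it does not matter that they are joined into a single token on the line. A blank range, such as a missing strand, is simply left out of the map.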