[Biojava-l] blast SAX parser
Russell Smithies
russell.smithies at xtra.co.nz
Thu Jun 26 12:07:46 EDT 2003
Some people may call this cheating, but I wrote a simple pre-processor
utility for blast XML. It rewrites each single-line element such as
<Hit_num>1</Hit_num> as <Hit num="1"/>, which a basic SAX parser can read
from the attributes alone :-)
-----------------------------------------------------------
import java.io.*;

public class XMLPreProcessor {
    /**
     * A simple utility method to create a new XML file containing data
     * converted from the default blast -m7 XML format into something that
     * can be easily read by a standard SAX parser.
     *
     * @param inFileName name of file in default blast -m7 format
     * @param outfileName name of output file converted to SAX-parser
     *        compliant XML
     * @author Russell Smithies
     */
    public void process(String inFileName, String outfileName) {
        BufferedReader in = null;
        BufferedWriter out = null;
        try {
            in = new BufferedReader(new FileReader(new File(inFileName)));
            out = new BufferedWriter(new FileWriter(outfileName));
            // copy the XML version header unchanged (skip if the file is empty)
            String line = in.readLine();
            if (line != null) {
                out.write(line);
                out.newLine();
            }
            // readLine() returns null at end of file; ready() is not an EOF test
            while ((line = in.readLine()) != null) {
                int gt = line.indexOf(">");
                int us = line.indexOf("_");
                // only lines shaped like <Tag_name>value</Tag_name> are rewritten
                boolean leaf = gt >= 0 && us >= 0 && us < gt
                        && line.lastIndexOf("<") > gt && line.endsWith(">");
                if (line.indexOf("<!") >= 0 || !leaf) {
                    // single-line DTD declarations, container tags on their
                    // own line, and anything unexpected are copied unchanged
                    out.write(line);
                } else {
                    // <Tag_name>value</Tag_name> becomes <Tag name="value"/>
                    StringBuffer sb = new StringBuffer(line);
                    sb.replace(gt, gt + 1, "=\"");
                    sb.delete(sb.lastIndexOf("<"), sb.length() - 1);
                    sb.insert(sb.length() - 1, "\"/");
                    sb.replace(us, us + 1, " ");
                    out.write(sb.toString());
                }
                out.newLine();
            }
            out.flush();
        } catch (IOException ex) {
            ex.printStackTrace();
        } finally {
            // close both streams even if an exception was thrown
            try { if (in != null) in.close(); } catch (IOException ignored) { }
            try { if (out != null) out.close(); } catch (IOException ignored) { }
        }
    }
}
--------------------------------------------------------------------------
It produces nice-looking XML, but it's probably not worth adding to BioJava.
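
For anyone wondering what reading the converted file might look like, here
is a rough sketch using the standard JAXP SAX API. The class name
ConvertedBlastReader, the method readAttributes, and the sample fragment are
all made up for illustration; they are not part of the utility above or of
BioJava. It only assumes the converted elements look like <Hit num="1"/>.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical reader for the converted format; names are illustrative only.
public class ConvertedBlastReader {

    /** Collects "element.attribute=value" strings from converted XML text. */
    public static List<String> readAttributes(String xml) throws Exception {
        final List<String> found = new ArrayList<String>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                // every value now arrives as an attribute, so there is no
                // need to buffer character data between start and end tags
                for (int i = 0; i < atts.getLength(); i++) {
                    found.add(qName + "." + atts.getQName(i)
                            + "=" + atts.getValue(i));
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return found;
    }

    public static void main(String[] args) throws Exception {
        // made-up fragment in the converted form, not real blast output
        String xml = "<Hit>\n"
                   + "  <Hit num=\"1\"/>\n"
                   + "  <Hit len=\"412\"/>\n"
                   + "</Hit>";
        System.out.println(readAttributes(xml));
    }
}
```

The point of the attribute form is that the handler only needs
startElement(); with the original -m7 layout it would also have to collect
text in characters() and match it up in endElement().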
Russell