[Biojava-l] parsing BLAST result
Charles Imbusch
charles at imbusch.net
Wed Jul 23 09:40:30 UTC 2008
Hello,
for a project I have to parse Blast output files. To do this I used the code
provided on this page:
http://biojava.org/wiki/BioJava:CookBook:Blast:Parser
I'm interested in the start and stop positions of the subject I align with,
so I adjusted the code a bit so that it looks like this:
    // list the hits
    for (Iterator k = result.getHits().iterator(); k.hasNext(); ) {
        SeqSimilaritySearchHit hit = (SeqSimilaritySearchHit) k.next();
        System.out.print("\tmatch: " + hit.getSubjectID());
        System.out.print("\tSubSeqStart: " + hit.getSubjectStart());
        System.out.print("\tSubSeqStop: " + hit.getSubjectEnd());
        System.out.println("\te score: " + hit.getEValue());
    }
I execute "java BlastParserOriginal S2431-F.fasta.txt" and have a look at the
best hit:
...
match: 48_scaffold.txt SubSeqStart: 3320 SubSeqStop: 2952643 e score: 0.0
...
The subject ID is correct, but the numbers are just nonsense. It should be
610956 for the start and 610367 for the end position.
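One guess on my side, which I have not verified: the hit-level start and end might be the smallest and largest subject coordinate over all HSPs of that hit, rather than the coordinates of the best HSP. The sketch below is plain Java without BioJava, only to illustrate that guess. The class HspSpanSketch and its helper methods are hypothetical, and apart from 610956/610367 the HSP coordinates and all scores are made up so that the span matches the numbers I see. If I read the API correctly, the real per-HSP values should be reachable through hit.getSubHits(), but I have not tried that yet.

```java
import java.util.Arrays;
import java.util.List;

public class HspSpanSketch {

    /** Subject coordinates of one HSP; start > end means minus strand. */
    public static final class Hsp {
        public final int subjectStart;
        public final int subjectEnd;
        public final double score;

        public Hsp(int subjectStart, int subjectEnd, double score) {
            this.subjectStart = subjectStart;
            this.subjectEnd = subjectEnd;
            this.score = score;
        }
    }

    /** Smallest subject coordinate over all HSPs (assumed hit-level start). */
    public static int spanStart(List<Hsp> hsps) {
        int min = Integer.MAX_VALUE;
        for (Hsp h : hsps) {
            min = Math.min(min, Math.min(h.subjectStart, h.subjectEnd));
        }
        return min;
    }

    /** Largest subject coordinate over all HSPs (assumed hit-level end). */
    public static int spanEnd(List<Hsp> hsps) {
        int max = Integer.MIN_VALUE;
        for (Hsp h : hsps) {
            max = Math.max(max, Math.max(h.subjectStart, h.subjectEnd));
        }
        return max;
    }

    /** HSP with the highest score; its coordinates are what I expected. */
    public static Hsp best(List<Hsp> hsps) {
        Hsp best = hsps.get(0);
        for (Hsp h : hsps) {
            if (h.score > best.score) {
                best = h;
            }
        }
        return best;
    }

    /** Example HSPs: only the first coordinate pair comes from my data. */
    public static List<Hsp> exampleHsps() {
        return Arrays.asList(
            new Hsp(610956, 610367, 1100.0), // expected best HSP, score made up
            new Hsp(3320, 3410, 60.0),       // made-up weak HSP
            new Hsp(2952560, 2952643, 58.0)  // made-up weak HSP
        );
    }

    public static void main(String[] args) {
        List<Hsp> hsps = exampleHsps();
        Hsp b = best(hsps);
        System.out.println("span over all HSPs: " + spanStart(hsps) + ".." + spanEnd(hsps));
        System.out.println("best HSP:           " + b.subjectStart + ".." + b.subjectEnd);
    }
}
```

With these example values the span comes out as 3320..2952643 while the best HSP is 610956..610367, which is the same pattern as in my output.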
This doesn't happen with all Blast result files, only with some. Is there a
solution for that? How do you parse the Blast files?
I just uploaded the Blast output to http://charles.imbusch.net/tmp/
Any answer is appreciated.
Cheers,
Charles