[Biojava-l] BLAST Parser for extracting all BLAST data?
Y D Sun
Yudong.Sun at newcastle.ac.uk
Sun Jun 26 05:42:08 EDT 2005
Hi,
I want to extract all of the data from BLASTP results. For the hit below,
for example, I need the lengths of the query and subject proteins, the
identities (all three values: 54, 124 and 43%), the positives (79, 124
and 63%), and the gaps (3, 124 and 2%). Can the BLASTLikeSAXParser
extract all of this information? I can't find methods in the
SeqSimilaritySearchHit and SeqSimilaritySearchSubHit APIs that
retrieve these data. Does Biojava provide any methods for this purpose?
Thanks,
George
BLASTP 2.2.5 [Nov-16-2002]
Query= Prot0001
(138 letters)
Database: /work/nys1/fasta/protein/AE000782.pro.fasta
2407 sequences; 662,866 total letters
Searching.....done
                                                     Score     E
Sequences producing significant alignments:          (bits)  Value

Prot0002                                               100   1e-23
Prot0003                                                74   2e-15
Prot0004                                                43   3e-06
>Prot0002
Length = 138
Score = 100 bits (250), Expect = 1e-23
Identities = 54/124 (43%), Positives = 79/124 (63%), Gaps = 3/124 (2%)
Query: 18  NARTKFTDIAKTLNLTEAAIRKRIKKLEENQIIKRYSIDIDYKKLGYNMAIIGLDIDMDY 77
           NAR   T IAK LN+TEAA+RKRI  LE  + I  Y   I+YKK+G + ++ G+D+D D
Sbjct: 15  NARIPKTRIAKELNVTEAAVRKRIANLERREEILGYKAIINYKKVGLSASLTGVDVDPDK 74

Query: 78  FPKIIKELEKRKEFLHIYSSAGDHDIMVIAIYK---DLEEIYNYLKNLKGVKRVCPAIII 134
             K+++EL+  +    ++ + GDH IM   I K   +L EI+  +  ++GVKRVCP+II
Sbjct: 75  LWKVVEELKDLESVKSLWLTTGDHTIMAEIIAKSVQELSEIHQKIAEMEGVKRVCPSIIT 134

Query: 135 DQIK 138
           D +K
Sbjct: 135 DIVK 138
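
For reference, the values asked about could also be read straight from the report text with regular expressions, independent of BioJava. The following is only a sketch under that assumption: the class and method names (BlastStatsSketch, parseStat, parseLength) are hypothetical and are not part of any BioJava API, and the patterns are based solely on the BLASTP 2.2.5 lines shown above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not a BioJava class: parses lines of the plain-text
// BLASTP report shown above with java.util.regex.
public class BlastStatsSketch {

    // Parses "<label> = a/b (c%)" and returns {a, b, c}.
    // Returns null if the label is absent from the line
    // (BLAST leaves out the Gaps field when there are no gaps).
    static int[] parseStat(String line, String label) {
        Pattern p = Pattern.compile(
                Pattern.quote(label) + "\\s*=\\s*(\\d+)/(\\d+)\\s*\\((\\d+)%\\)");
        Matcher m = p.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new int[] {
            Integer.parseInt(m.group(1)),
            Integer.parseInt(m.group(2)),
            Integer.parseInt(m.group(3))
        };
    }

    // Parses a length from either "(138 letters)" (query) or
    // "Length = 138" (subject). Returns -1 if neither form matches.
    static int parseLength(String line) {
        Matcher m = Pattern
                .compile("\\((\\d+) letters\\)|Length\\s*=\\s*(\\d+)")
                .matcher(line);
        if (!m.find()) {
            return -1;
        }
        String digits = (m.group(1) != null) ? m.group(1) : m.group(2);
        return Integer.parseInt(digits);
    }

    public static void main(String[] args) {
        String stats = "Identities = 54/124 (43%), Positives = 79/124 (63%), "
                + "Gaps = 3/124 (2%)";
        for (String label : new String[] {"Identities", "Positives", "Gaps"}) {
            int[] s = parseStat(stats, label);
            System.out.println(label + ": " + s[0] + " of " + s[1]
                    + " (" + s[2] + "%)");
        }
        System.out.println("Query length: " + parseLength("(138 letters)"));
        System.out.println("Subject length: " + parseLength("Length = 138"));
    }
}
```

This works line by line, so it would have to be driven by a loop over the report that tracks which hit the current line belongs to; it does not replace a real parser for the full report.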