[Biojava-l] BLAST Parser for extracting all BLAST data?
Y D Sun (Yudong.Sun at newcastle.ac.uk)
Sun Jun 26 05:42:08 EDT 2005

Hi,

I want to extract all of the data from BLASTP results. In the following
hit, for example, I need the lengths of the query and subject proteins,
the identities (all three values: 54, 124 and 43%), the positives (79,
124 and 63%), and the gaps (3, 124 and 2%). Can the BLASTLikeSAXParser
extract all of this information? I can't find any methods in the
SeqSimilaritySearchHit and SeqSimilaritySearchSubHit APIs to retrieve
these values. Does BioJava provide any methods for this purpose?

Thanks,
George

BLASTP 2.2.5 [Nov-16-2002]

Query= Prot0001
         (138 letters)

Database: /work/nys1/fasta/protein/AE000782.pro.fasta
           2407 sequences; 662,866 total letters

Searching.....done

                                                                 Score    E
Sequences producing significant alignments:                      (bits) Value

Prot0002                                                           100  1e-23
Prot0003                                                            74  2e-15
Prot0004                                                            43  3e-06

>Prot0002
          Length = 138

 Score =  100 bits (250), Expect = 1e-23
 Identities = 54/124 (43%), Positives = 79/124 (63%), Gaps = 3/124 (2%)

Query: 18  NARTKFTDIAKTLNLTEAAIRKRIKKLEENQIIKRYSIDIDYKKLGYNMAIIGLDIDMDY 77
           NAR   T IAK LN+TEAA+RKRI  LE  + I  Y   I+YKK+G + ++ G+D+D D
Sbjct: 15  NARIPKTRIAKELNVTEAAVRKRIANLERREEILGYKAIINYKKVGLSASLTGVDVDPDK 74

Query: 78  FPKIIKELEKRKEFLHIYSSAGDHDIMVIAIYK---DLEEIYNYLKNLKGVKRVCPAIII 134
             K+++EL+  +    ++ + GDH IM   I K   +L EI+  +  ++GVKRVCP+II
Sbjct: 75  LWKVVEELKDLESVKSLWLTTGDHTIMAEIIAKSVQELSEIHQKIAEMEGVKRVCPSIIT 134

Query: 135 DQIK 138
           D +K
Sbjct: 135 DIVK 138
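
For reference, the values asked about (both lengths, plus the count,
alignment length and percentage for identities, positives and gaps) can
also be read directly from the report text. The following is only a
minimal sketch in plain Java using java.util.regex. It does not use or
represent the BioJava API, and the names BlastHspStats, parseStats,
parseQueryLength and parseSubjectLength are hypothetical, chosen for
illustration. It assumes the BLASTP 2.2.5 text layout shown in the
report above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of BioJava: regex extraction from BLASTP text.
class BlastHspStats {

    // Matches one field such as "Identities = 54/124 (43%)".
    // Group 1 is the label, so one pattern covers all three fields.
    private static final Pattern STAT = Pattern.compile(
            "(Identities|Positives|Gaps) = (\\d+)/(\\d+) \\((\\d+)%\\)");

    // Matches the query length line, e.g. "(138 letters)".
    private static final Pattern QUERY_LEN =
            Pattern.compile("\\((\\d+) letters\\)");

    // Matches the subject length line, e.g. "Length = 138".
    private static final Pattern SUBJECT_LEN =
            Pattern.compile("Length = (\\d+)");

    /** Returns label -> {count, alignment length, percent} for a summary line. */
    static Map<String, int[]> parseStats(String line) {
        Map<String, int[]> stats = new LinkedHashMap<String, int[]>();
        Matcher m = STAT.matcher(line);
        while (m.find()) {
            stats.put(m.group(1), new int[] {
                    Integer.parseInt(m.group(2)),
                    Integer.parseInt(m.group(3)),
                    Integer.parseInt(m.group(4)) });
        }
        return stats;
    }

    /** Returns the query length, or -1 if the line has no "(N letters)". */
    static int parseQueryLength(String line) {
        Matcher m = QUERY_LEN.matcher(line);
        return m.find() ? Integer.parseInt(m.group(1)) : -1;
    }

    /** Returns the subject length, or -1 if the line has no "Length = N". */
    static int parseSubjectLength(String line) {
        Matcher m = SUBJECT_LEN.matcher(line);
        return m.find() ? Integer.parseInt(m.group(1)) : -1;
    }

    public static void main(String[] args) {
        String summary = " Identities = 54/124 (43%), Positives = 79/124 (63%),"
                + " Gaps = 3/124 (2%)";
        for (Map.Entry<String, int[]> e : parseStats(summary).entrySet()) {
            int[] v = e.getValue();
            System.out.println(e.getKey() + ": " + v[0] + "/" + v[1]
                    + " (" + v[2] + "%)");
        }
        System.out.println("query length: "
                + parseQueryLength("         (138 letters)"));
        System.out.println("subject length: "
                + parseSubjectLength("          Length = 138"));
    }
}
```

The three fields share one pattern with the label captured as a group,
so a summary line that lacks a Gaps field (an ungapped hit) simply
yields a map without that key.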
    
    