[BioPython] HSPs in Blast parser
    Bzy Bee 
    nomy2020 at yahoo.com
       
    Fri Apr 30 00:26:00 EDT 2004
    
    
  
Hi
 
I am stuck on parsing BlastN output and would appreciate some help. I am working with multiple HSPs for a single hit. For example, if two HSPs are found for one hit, I need to find where the query and subject end in the first HSP and compare that with where the query and subject start in the next HSP, e.g. in the following example:
 
>test_seq1
          Length = 424
 Score =  841 bits (424), Expect = 0.0
 Identities = 424/424 (100%)
 Strand = Plus / Plus
                                                                       
Query: 1   ggactggttcgtcgtttacaagctgccggcccacacagggtcgggagatgcgacgcagaa 60
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 1   ggactggttcgtcgtttacaagctgccggcccacacagggtcgggagatgcgacgcagaa 60
                                                                       
Query: 61  cggcctgcggtacaagtactttgacgaacactcagaagactggagcgacggcgtggggtt 120
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 61  cggcctgcggtacaagtactttgacgaacactcagaagactggagcgacggcgtggggtt 120
                                                                       
 Score =  226 bits (114), Expect = 2e-58
 Identities = 141/150 (94%)
 Strand = Plus / Plus
                                                                       
Query: 275 ccagctcgcctttgtgctctacaatgaccaaccgcctaaatgcagcgagtgtaaggactc 334
           ||||||||||||||||||||||||||||||||||||||||| |||||||| |||||||||
Sbjct: 513 ccagctcgcctttgtgctctacaatgaccaaccgcctaaatccagcgagtctaaggactc 572
                                                                       
Query: 335 ttgcagtcgtgggcacacgaagggtgtgctgctcctggaccaagaagggggcttgtggtt 394
           || ||||||||||||||||||||||||||||||||||||||||||||||||||| |||||
Sbjct: 573 ttccagtcgtgggcacacgaagggtgtgctgctcctggaccaagaagggggcttctggtt 632
 
I am interested in where the Query and Sbjct ended in the first HSP shown above (i.e. 120, 120) and where they started in the second HSP (i.e. 275, 513).
 
I have noticed that with the Blast parser one can iterate through each HSP of every hit, but I am not sure how to treat two HSPs of the same hit as related, i.e. how to walk through them as a pair so that I can get the query (and subject) end of one and the query (and subject) start of the other.
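 
To make the goal concrete, here is a minimal sketch of the comparison I have in mind. It does not use the real parser objects: the Hsp class and the adjacent_hsp_ends_and_starts function are hypothetical stand-ins I made up for illustration, and I am assuming that each HSP of a hit can give me start and end coordinates for both query and subject. The numbers are the ones from the excerpt above.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Hsp:
    """Hypothetical stand-in for one HSP of a hit (not the Biopython class)."""
    query_start: int
    query_end: int
    sbjct_start: int
    sbjct_end: int


def adjacent_hsp_ends_and_starts(hsps: List[Hsp]) -> List[Tuple[int, int, int, int]]:
    """For each pair of neighbouring HSPs of one hit, return
    (query end of first, query start of next,
     sbjct end of first, sbjct start of next).

    HSPs are ordered by query start first, since BLAST lists them by score.
    """
    ordered = sorted(hsps, key=lambda h: h.query_start)
    pairs = []
    # zip(ordered, ordered[1:]) walks the list as (1st, 2nd), (2nd, 3rd), ...
    for first, nxt in zip(ordered, ordered[1:]):
        pairs.append((first.query_end, nxt.query_start,
                      first.sbjct_end, nxt.sbjct_start))
    return pairs


# The two HSPs of test_seq1, using the coordinates visible in the excerpt.
hit_hsps = [
    Hsp(query_start=1, query_end=120, sbjct_start=1, sbjct_end=120),
    Hsp(query_start=275, query_end=394, sbjct_start=513, sbjct_end=632),
]

for q_end, q_start, s_end, s_start in adjacent_hsp_ends_and_starts(hit_hsps):
    print("query: ends %d, next starts %d" % (q_end, q_start))
    print("sbjct: ends %d, next starts %d" % (s_end, s_start))
```

What I cannot work out is how to get the equivalent of hit_hsps, i.e. all HSPs belonging to one hit, out of the parsed record so that I can pair them up like this.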
 
Any help would be highly appreciated.
 
Thanks
 
Jawad Ali
		
    
    