[Bioperl-l] Can't parse blast report written by Bio::SearchIO::Writer::TextResultWriter
Jason Stajich
jason at bioperl.org
Mon May 12 23:53:15 UTC 2008
Okay, so there's a bug. I remember someone tried to fix something
in the writers recently, so I will have to look and see how that got
broken and how it can be fixed.
-j
On May 12, 2008, at 4:26 PM, Prachi Shah wrote:
> Hi Jason,
>
> The negative coordinates in the HSP show up when I generate a Text
> report regardless of how/if I sort the HSP order. I think it has
> something to do with the frame. In the example I gave, the Query
> sequence matches the subject sequence on the negative strand. My guess
> is that TextResultWriter somehow takes the strand into account and
> tries to recalculate the start and stop locations?
>
> Thanks,
> Prachi
>
> On Mon, May 12, 2008 at 4:21 PM, Jason Stajich <jason at bioperl.org>
> wrote:
>> That's a very strange bug - I don't quite understand where it is coming
>> from. If you don't mess with the HSP order and start with a report and
>> generate the Text report output, does it also give the negative
>> coordinates, or are you still reconstituting the Hit/HSP objects
>> "manually" in your code?
>>
>> -jason
>>
>>
>> On May 12, 2008, at 4:17 PM, Prachi Shah wrote:
>>
>>
>>> Thanks Jason for adding the sort_hsps method in
>>> Bio::Search::Hit::GenericHit. I tested it out and it works great.
>>>
>>> The other issue I have is the format of HSP start and stop coordinates
>>> when I write a new blast report (with HSPs sorted) using
>>> Bio::SearchIO::Writer::TextResultWriter. Below is an example of the
>>> same HSP alignment as output from BLAST and later when the blast
>>> report is generated by TextResultWriter. Notice the change in start
>>> and stop coordinates. I would like to keep the start and stop format
>>> as in the first case. How do I specify that? Any pointers are
>>> greatly appreciated.
>>>
>>> Thanks,
>>> Prachi
>>>
>>>
>>> ----------------------------------------------------------------------------------------------------
>>> **HSP alignment in blast report generated by BLAST itself:
>>>
>>> Score = 10150 (1529.0 bits), Expect = 0., Sum P(3) = 0.
>>> Identities = 2120/2345 (90%), Positives = 2120/2345 (90%), Strand = Minus / Plus
>>>
>>> Query:    2364 CATATCCAGATCTATCTTGATGATTCTTATTAGAATATGTATCTGAAGATGTGCCACTTG 2305
>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>> Sbjct: 2251160 CATATCCAGATCTATCTTGATGATTCTTATTAGAATATGTATCTGAAGATGTGCCACTTG 2251219
>>>
>>> Query:    2304 TTGGAGGTGGTGGAGCTCTTCTAGCAGGAATAAGTTCAGATTTATTCATCAAATTATTCA 2245
>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>> Sbjct: 2251220 TTGGAGGTGGTGGAGCTCTTCTAGCAGGAATAAGTTCAGATTTATTCATCAAATTATTCA 2251279
>>>
>>> Query:    2244 ATGGTGAAACGTTTNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNC 2185
>>>                ||||||||||||||                                             |
>>> Sbjct: 2251280 ATGGTGAAACGTTTTTAGTATTATTATTGTTAGTGCTGTTGTTATTATTATTATTATTAC 2251339
>>>
>>> Query:    2184 CAGAACTAGGTAATGAGCCTGATGATGATGTATGTTGGTGGGAAGAGCCATTTAGTTGTG 2125
>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>> Sbjct: 2251340 CAGAACTAGGTAATGAGCCTGATGATGATGTATGTTGGTGGGAAGAGCCATTTAGTTGTG 2251399
>>>
>>> Query:    2124 TCAAATGATATGGAGTTGGTGGTTTTGGTGCAGCTCGACTAGGTTTGAATTGTGAGACAG 2065
>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>> Sbjct: 2251400 TCAAATGATATGGAGTTGGTGGTTTTGGTGCAGCTCGACTAGGTTTGAATTGTGAGACAG 2251459
>>>
>>> Query:    2064 TAGATTTTGCTGGAGGTTTTACCCATTCTTGTAAATTTGCCTCTTGGACATTGTTTTTGG 2005
>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>> Sbjct: 2251460 TAGATTTTGCTGGAGGTTTTACCCATTCTTGTAAATTTGCCTCTTGGACATTGTTTTTGG 2251519
>>>
>>> Query:    2004 CTGATGAGTAATTGTTAGGGTCATTATTATTATTGTTGGTTTTGGAATTGATCATGGGTG 1945
>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>> Sbjct: 2251520 CTGATGAGTAATTGTTAGGGTCATTATTATTATTGTTGGTTTTGGAATTGATCATGGGTG 2251579
>>>
>>> Query:    1944 ATCCAATTGGAGTTCCAGCAGCAGAATTACCTCCATTTATATCGGAATAAAATTCTAAAA 1885
>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>> Sbjct: 2251580 ATCCAATTGGAGTTCCAGCAGCAGAATTACCTCCATTTATATCGGAATAAAATTCTAAAA 2251639
>>>
>>> Query:    1884 CTTTAATAACAGCAACAGGATCTTTTTTCCAATCCTCATTAGTGATTTTCGAATGTTGTA 1825
>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>> Sbjct: 2251640 CTTTAATAACAGCAACAGGATCTTTTTTCCAATCCTCATTAGTGATTTTCGAATGTTGTA 2251699
>>>
>>>
>>>
>>> ----------------------------------------------------------------------------------------------------
>>> ** HSP alignment written by TextResultWriter:
>>>
>>> Score = 1529.0 bits (10150), Expect = 0., P = 0.
>>> Identities = 2120/2345 (90%)
>>> Frame = -1 / +1
>>>
>>> Query:      20 CATATCCAGATCTATCTTGATGATTCTTATTAGAATATGTATCTGAAGATGTGCCACTTG -39
>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>> Sbjct: 2251160 CATATCCAGATCTATCTTGATGATTCTTATTAGAATATGTATCTGAAGATGTGCCACTTG 2251219
>>>
>>> Query:     -40 TTGGAGGTGGTGGAGCTCTTCTAGCAGGAATAAGTTCAGATTTATTCATCAAATTATTCA -99
>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>> Sbjct: 2251220 TTGGAGGTGGTGGAGCTCTTCTAGCAGGAATAAGTTCAGATTTATTCATCAAATTATTCA 2251279
>>>
>>> Query:    -100 ATGGTGAAACGTTTNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNC -159
>>>                ||||||||||||||                                             |
>>> Sbjct: 2251280 ATGGTGAAACGTTTTTAGTATTATTATTGTTAGTGCTGTTGTTATTATTATTATTATTAC 2251339
>>>
>>> Query:    -160 CAGAACTAGGTAATGAGCCTGATGATGATGTATGTTGGTGGGAAGAGCCATTTAGTTGTG -219
>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>> Sbjct: 2251340 CAGAACTAGGTAATGAGCCTGATGATGATGTATGTTGGTGGGAAGAGCCATTTAGTTGTG 2251399
>>>
>>> Query:    -220 TCAAATGATATGGAGTTGGTGGTTTTGGTGCAGCTCGACTAGGTTTGAATTGTGAGACAG -279
>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>> Sbjct: 2251400 TCAAATGATATGGAGTTGGTGGTTTTGGTGCAGCTCGACTAGGTTTGAATTGTGAGACAG 2251459
>>>
>>> Query:    -280 TAGATTTTGCTGGAGGTTTTACCCATTCTTGTAAATTTGCCTCTTGGACATTGTTTTTGG -339
>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>> Sbjct: 2251460 TAGATTTTGCTGGAGGTTTTACCCATTCTTGTAAATTTGCCTCTTGGACATTGTTTTTGG 2251519
>>>
>>> Query:    -340 CTGATGAGTAATTGTTAGGGTCATTATTATTATTGTTGGTTTTGGAATTGATCATGGGTG -399
>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>> Sbjct: 2251520 CTGATGAGTAATTGTTAGGGTCATTATTATTATTGTTGGTTTTGGAATTGATCATGGGTG 2251579
>>>
>>> Query:    -400 ATCCAATTGGAGTTCCAGCAGCAGAATTACCTCCATTTATATCGGAATAAAATTCTAAAA -459
>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>> Sbjct: 2251580 ATCCAATTGGAGTTCCAGCAGCAGAATTACCTCCATTTATATCGGAATAAAATTCTAAAA 2251639
>>>
>>> Query:    -460 CTTTAATAACAGCAACAGGATCTTTTTTCCAATCCTCATTAGTGATTTTCGAATGTTGTA -519
>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>> Sbjct: 2251640 CTTTAATAACAGCAACAGGATCTTTTTTCCAATCCTCATTAGTGATTTTCGAATGTTGTA 2251699
>>>
>>
>>
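
The two alignments quoted above differ only in the query coordinates: BLAST numbers the minus-strand query from 2364 down to 2305 on the first line, while the TextResultWriter output runs from 20 down to -39. One reading that fits those numbers (an assumption, not a confirmed diagnosis of the BioPerl code) is that the per-line counter starts at the HSP's low coordinate instead of its high one. The low coordinate 20 is itself inferred as 2364 - 2345 + 1, assuming the query side has no gaps. A minimal Python sketch of that arithmetic follows; `line_coords` and its `origin` parameter are hypothetical names used for illustration and are not part of BioPerl.

```python
def line_coords(low, high, strand, width=60, origin=None):
    """Return (first, last) residue numbers for each alignment line.

    low/high are the HSP's smallest and largest coordinates (low <= high),
    strand is +1 or -1, width is residues per line. Assumes no gaps.
    origin overrides the starting coordinate; it exists only to mimic the
    suspected wrong starting point and is purely illustrative.
    """
    length = high - low + 1
    if origin is None:
        # Plus strand counts up from low; minus strand counts down from high.
        origin = low if strand > 0 else high
    coords = []
    for offset in range(0, length, width):
        n = min(width, length - offset)      # the last line may be shorter
        first = origin + strand * offset
        last = first + strand * (n - 1)
        coords.append((first, last))
    return coords


# Query side of the HSP in this thread: inferred span 20..2364, minus strand.
expected = line_coords(20, 2364, strand=-1)
# Same HSP, but counting down from the low coordinate (the suspected behavior).
suspect = line_coords(20, 2364, strand=-1, origin=20)

print(expected[:2])  # [(2364, 2305), (2304, 2245)]
print(suspect[:2])   # [(20, -39), (-40, -99)]
```

The first call reproduces the per-line numbers in the original BLAST report, and the second reproduces the negative numbers in the rewritten report, which is why the starting coordinate looks like the place to check first.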