[Bioperl-l] Can't parse blast report written by Bio::SearchIO::Writer::TextResultWriter
Chris Fields
cjfields at uiuc.edu
Tue May 13 00:33:25 UTC 2008
I ran some fixes on the writers recently. If we have the BLAST report
that triggers this, I can work on debugging it (I'll file a bug for
tracking).
chris
On May 12, 2008, at 6:53 PM, Jason Stajich wrote:
> Okay - so there's a bug. I remember someone tried to fix something
> in the writers recently, so we will have to look and see how that got
> broken and how it can be fixed.
> -j
> On May 12, 2008, at 4:26 PM, Prachi Shah wrote:
>
>> Hi Jason,
>>
>> The negative coordinates in the HSP show up when I generate a Text
>> report regardless of how/if I sort the HSP order. I think it has
>> something to do with the frame. In the example I gave, the Query
>> sequence matches the subject sequence on the negative strand. My
>> guess is that TextResultWriter somehow takes the strand into account
>> and tries to recalculate the start and stop locations?
>>
>> Thanks,
>> Prachi
>>
>> On Mon, May 12, 2008 at 4:21 PM, Jason Stajich <jason at bioperl.org>
>> wrote:
>>> That's a very strange bug - I don't quite understand where it is
>>> coming from. If you don't mess with the HSP order, and just start
>>> with a report and generate the Text report output, does it also give
>>> the negative coordinates? Or are you still reconstituting the Hit/HSP
>>> objects "manually" in your code?
>>>
>>> -jason
>>>
>>>
>>> On May 12, 2008, at 4:17 PM, Prachi Shah wrote:
>>>
>>>
>>>> Thanks Jason for adding the sort_hsps method in
>>>> Bio::Search::Hit::GenericHit. I tested it out and it works great.
>>>>
>>>> The other issue I have is the format of the HSP start and stop
>>>> coordinates when I write a new blast report (with HSPs sorted) using
>>>> Bio::SearchIO::Writer::TextResultWriter. Below is an example of the
>>>> same HSP alignment as output by BLAST and later when the blast
>>>> report is generated by TextResultWriter. Notice the change in start
>>>> and stop coordinates. I would like to keep the start and stop format
>>>> as in the first case. How do I specify that? Any pointers are
>>>> greatly appreciated.
>>>>
>>>> Thanks,
>>>> Prachi
>>>>
>>>>
>>>> ----------------------------------------------------------------------------------------------------
>>>> **HSP alignment in blast report generated by BLAST itself:
>>>>
>>>>  Score = 10150 (1529.0 bits), Expect = 0., Sum P(3) = 0.
>>>>  Identities = 2120/2345 (90%), Positives = 2120/2345 (90%), Strand = Minus / Plus
>>>>
>>>> Query:    2364 CATATCCAGATCTATCTTGATGATTCTTATTAGAATATGTATCTGAAGATGTGCCACTTG 2305
>>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>>> Sbjct: 2251160 CATATCCAGATCTATCTTGATGATTCTTATTAGAATATGTATCTGAAGATGTGCCACTTG 2251219
>>>>
>>>> Query:    2304 TTGGAGGTGGTGGAGCTCTTCTAGCAGGAATAAGTTCAGATTTATTCATCAAATTATTCA 2245
>>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>>> Sbjct: 2251220 TTGGAGGTGGTGGAGCTCTTCTAGCAGGAATAAGTTCAGATTTATTCATCAAATTATTCA 2251279
>>>>
>>>> Query:    2244 ATGGTGAAACGTTTNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNC 2185
>>>>                ||||||||||||||                                             |
>>>> Sbjct: 2251280 ATGGTGAAACGTTTTTAGTATTATTATTGTTAGTGCTGTTGTTATTATTATTATTATTAC 2251339
>>>>
>>>> Query:    2184 CAGAACTAGGTAATGAGCCTGATGATGATGTATGTTGGTGGGAAGAGCCATTTAGTTGTG 2125
>>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>>> Sbjct: 2251340 CAGAACTAGGTAATGAGCCTGATGATGATGTATGTTGGTGGGAAGAGCCATTTAGTTGTG 2251399
>>>>
>>>> Query:    2124 TCAAATGATATGGAGTTGGTGGTTTTGGTGCAGCTCGACTAGGTTTGAATTGTGAGACAG 2065
>>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>>> Sbjct: 2251400 TCAAATGATATGGAGTTGGTGGTTTTGGTGCAGCTCGACTAGGTTTGAATTGTGAGACAG 2251459
>>>>
>>>> Query:    2064 TAGATTTTGCTGGAGGTTTTACCCATTCTTGTAAATTTGCCTCTTGGACATTGTTTTTGG 2005
>>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>>> Sbjct: 2251460 TAGATTTTGCTGGAGGTTTTACCCATTCTTGTAAATTTGCCTCTTGGACATTGTTTTTGG 2251519
>>>>
>>>> Query:    2004 CTGATGAGTAATTGTTAGGGTCATTATTATTATTGTTGGTTTTGGAATTGATCATGGGTG 1945
>>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>>> Sbjct: 2251520 CTGATGAGTAATTGTTAGGGTCATTATTATTATTGTTGGTTTTGGAATTGATCATGGGTG 2251579
>>>>
>>>> Query:    1944 ATCCAATTGGAGTTCCAGCAGCAGAATTACCTCCATTTATATCGGAATAAAATTCTAAAA 1885
>>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>>> Sbjct: 2251580 ATCCAATTGGAGTTCCAGCAGCAGAATTACCTCCATTTATATCGGAATAAAATTCTAAAA 2251639
>>>>
>>>> Query:    1884 CTTTAATAACAGCAACAGGATCTTTTTTCCAATCCTCATTAGTGATTTTCGAATGTTGTA 1825
>>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>>> Sbjct: 2251640 CTTTAATAACAGCAACAGGATCTTTTTTCCAATCCTCATTAGTGATTTTCGAATGTTGTA 2251699
>>>>
>>>>
>>>>
>>>> ----------------------------------------------------------------------------------------------------
>>>> ** HSP alignment written by TextResultWriter:
>>>>
>>>>  Score = 1529.0 bits (10150), Expect = 0., P = 0.
>>>>  Identities = 2120/2345 (90%)
>>>>  Frame = -1 / +1
>>>>
>>>> Query:      20 CATATCCAGATCTATCTTGATGATTCTTATTAGAATATGTATCTGAAGATGTGCCACTTG -39
>>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>>> Sbjct: 2251160 CATATCCAGATCTATCTTGATGATTCTTATTAGAATATGTATCTGAAGATGTGCCACTTG 2251219
>>>>
>>>> Query:     -40 TTGGAGGTGGTGGAGCTCTTCTAGCAGGAATAAGTTCAGATTTATTCATCAAATTATTCA -99
>>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>>> Sbjct: 2251220 TTGGAGGTGGTGGAGCTCTTCTAGCAGGAATAAGTTCAGATTTATTCATCAAATTATTCA 2251279
>>>>
>>>> Query:    -100 ATGGTGAAACGTTTNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNC -159
>>>>                ||||||||||||||                                             |
>>>> Sbjct: 2251280 ATGGTGAAACGTTTTTAGTATTATTATTGTTAGTGCTGTTGTTATTATTATTATTATTAC 2251339
>>>>
>>>> Query:    -160 CAGAACTAGGTAATGAGCCTGATGATGATGTATGTTGGTGGGAAGAGCCATTTAGTTGTG -219
>>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>>> Sbjct: 2251340 CAGAACTAGGTAATGAGCCTGATGATGATGTATGTTGGTGGGAAGAGCCATTTAGTTGTG 2251399
>>>>
>>>> Query:    -220 TCAAATGATATGGAGTTGGTGGTTTTGGTGCAGCTCGACTAGGTTTGAATTGTGAGACAG -279
>>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>>> Sbjct: 2251400 TCAAATGATATGGAGTTGGTGGTTTTGGTGCAGCTCGACTAGGTTTGAATTGTGAGACAG 2251459
>>>>
>>>> Query:    -280 TAGATTTTGCTGGAGGTTTTACCCATTCTTGTAAATTTGCCTCTTGGACATTGTTTTTGG -339
>>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>>> Sbjct: 2251460 TAGATTTTGCTGGAGGTTTTACCCATTCTTGTAAATTTGCCTCTTGGACATTGTTTTTGG 2251519
>>>>
>>>> Query:    -340 CTGATGAGTAATTGTTAGGGTCATTATTATTATTGTTGGTTTTGGAATTGATCATGGGTG -399
>>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>>> Sbjct: 2251520 CTGATGAGTAATTGTTAGGGTCATTATTATTATTGTTGGTTTTGGAATTGATCATGGGTG 2251579
>>>>
>>>> Query:    -400 ATCCAATTGGAGTTCCAGCAGCAGAATTACCTCCATTTATATCGGAATAAAATTCTAAAA -459
>>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>>> Sbjct: 2251580 ATCCAATTGGAGTTCCAGCAGCAGAATTACCTCCATTTATATCGGAATAAAATTCTAAAA 2251639
>>>>
>>>> Query:    -460 CTTTAATAACAGCAACAGGATCTTTTTTCCAATCCTCATTAGTGATTTTCGAATGTTGTA -519
>>>>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>>>> Sbjct: 2251640 CTTTAATAACAGCAACAGGATCTTTTTTCCAATCCTCATTAGTGATTTTCGAATGTTGTA 2251699
>>>>
>>>
>>>
>
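
A possible reading of the two listings quoted above, offered as a sketch and not as a diagnosis of the actual TextResultWriter code: the HSP is 2345 columns long and BLAST numbers the query from 2364 downward, so the low end of the query range would be 2364 - 2345 + 1 = 20, which is exactly where the rewritten report starts counting. The Python snippet below (plain arithmetic, no BioPerl calls) reproduces both numbering schemes under two assumptions that the thread does not confirm: that the query side of the HSP is ungapped, and that its range really is 20..2364. The function name line_coords and its parameters are made up for illustration.

```python
def line_coords(first, step, length, width=60):
    """Return (first, last) coordinates for each alignment line.

    first  -- coordinate printed at the start of the first line
    step   -- +1 to count upward, -1 to count downward (minus strand)
    length -- number of ungapped residues in the HSP (assumption: no gaps)
    width  -- residues per alignment line (60 in the quoted reports)
    """
    coords = []
    pos = first
    remaining = length
    while remaining > 0:
        n = min(width, remaining)
        last = pos + step * (n - 1)
        coords.append((pos, last))
        pos = last + step
        remaining -= n
    return coords


# Assumed query range, inferred from 2364 - 2345 + 1 = 20 (not stated in the thread).
HSP_START, HSP_END = 20, 2364
LENGTH = HSP_END - HSP_START + 1  # 2345

# Minus strand, counting down from the high coordinate: matches the BLAST report.
expected = line_coords(HSP_END, -1, LENGTH)

# Minus strand, counting down from the low coordinate: matches the rewritten report.
observed = line_coords(HSP_START, -1, LENGTH)

print(expected[:2])  # [(2364, 2305), (2304, 2245)]
print(observed[:2])  # [(20, -39), (-40, -99)]
```

If this reading is right, the sequence and subject coordinates are untouched and only the starting point of the query numbering differs, which would fit Prachi's guess that the strand is involved.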