[Bioperl-l] Error parsing blast results with blasttable
michael watson (IAH-C)
michael.watson at bbsrc.ac.uk
Fri Jan 7 08:51:18 EST 2005
OK, I was under the impression (mistakenly) that hits from blasttable
output didn't have HSPs.
-----Original Message-----
From: Jason Stajich [mailto:jason.stajich at duke.edu]
Sent: 07 January 2005 13:49
To: michael watson (IAH-C)
Cc: Bioperl List; Marc Logghe
Subject: Re: [Bioperl-l] Error parsing blast results with blasttable
On Jan 7, 2005, at 8:34 AM, michael watson ((IAH-C)) wrote:
> Actually, the very first answer from Marc solved a bug in my script -
> I'm simply trying to get the query start and end of the HSPs, I don't
> need them to be linked together into a coherent hit object with
> multiple HSPs, I'd be happy with them separate. I'm not trying to do
> anything fancy, just mark up the HSPs as features on the query
> sequence.
>
> When running the exact same script using "blast" format instead of
> "blasttable" I get exactly what I need - I was trying to use
> blasttable for efficiency sake though.
>
Sure. that is the right thing to do. but calling $hit->start
$hit->end is not really doing what you want so don't use it.
If you want the start and end of HSPs you need to be calling that on
the HSPs themselves. If you look at the searchIO howto
you'll see this construct
while (my $result = $searchio->next_result) {
while(my $hit = $result->next_hit) {
while( my $hsp = $hit->next_hsp) {
print $hsp->query->stary, " ",$hsp->query->end,
"\n";
}
}
}
That will work and get you the HSP start and end for the query. use
$hsp->hit->start and $hsp->hit->end to get the start/end of the hit
sequence coordinates in the HSP.
-jason
> Thanks
> Mick
>
> -----Original Message-----
> From: Jason Stajich [mailto:jason.stajich at duke.edu]
> Sent: 07 January 2005 13:30
> To: michael watson (IAH-C)
> Cc: Bioperl List; Marc Logghe
> Subject: Re: [Bioperl-l] Error parsing blast results with blasttable
>
>
> Right - but back to my question - what do you want to be getting out?
> Do you want the smallest HSP start position if you are calling
> $hit->start('query')? Are you hoping for fancy HSP tiling?
>
> I'm pretty sure the problem you are showing has to do with being
> unable to build a single compatible tiling path for a set of HSPs. I
> just think the code for doing this is just too brittle. There may
> also be a bug since the blasttable parser has less data available too
> it and that may be the cause as well, so will have to be investigated
> nonetheless.
>
> -jason
> On Jan 7, 2005, at 8:21 AM, michael watson ((IAH-C)) wrote:
>
>> Hi
>>
>> I submitted a bug which contains some blasttable output, example
>> code,
>
>> and the error produced.
>>
>> Mick
>>
>> -----Original Message-----
>> From: Jason Stajich [mailto:jason.stajich at duke.edu]
>> Sent: 07 January 2005 13:04
>> To: michael watson (IAH-C)
>> Cc: Bioperl List; Marc Logghe
>> Subject: Re: [Bioperl-l] Error parsing blast results with blasttable
>>
>>
>> $hit->start is a convience function which first tiles the HSPs and
>> then gives you the smallest start.
>>
>> If you look at the documentation for the method you see that not
>> giving it a type will give you start in the query and hit.
>>
>> Usage : $sbjct->start( [seq_type] );
>> Purpose : Gets the start coordinate for the query, sbjct, or both
>> sequences
>> : in the BlastHit object. If there is more than one HSP,
>> the
>>
>> lowest start
>> : value of all HSPs is returned.
>> Example : $qbeg = $sbjct->start('query');
>> : $sbeg = $sbjct->start('hit');
>> : ($qbeg, $sbeg) = $sbjct->start();
>> Returns : scalar context: integer
>> : array context without args: list of two integers
>> (queryStart, sbjctStart)
>> : Array context can be "induced" by providing an argument
>> of
>>
>> 'list' or 'array'.
>> Argument : In scalar context: seq_type = 'query' or 'hit' or
>> 'sbjct' (default = 'query')
>> ('sbjct' is synonymous with 'hit')
>> Throws : n/a
>> Comments : This method requires that all HSPs be tiled. If there
>> is
>
>> more than one
>> : HSP and they have not already been tiled, they will be
>> tiled first automatically..
>> : Remember that the start and end coordinates of all HSPs
>> are
>> : normalized so that start < end. Strand information can
> be
>> : obtained by calling $hit->strand().
>>
>> I don't know why you are seeing concatenated positions unless you are
>> somehow getting it in array context and then turning it into a
>> string.
>>
>>
>> I really don't use this, if I want tiled HSPs I use WU-BLAST with the
>> -links and build compatible HSP groups.
>>
>> What are you trying to get - the smallest hit or query start? Just
>> the
>
>> start/end for HSPs?
>>
>> If this is somehow a blasttable specific problem will try and see if
>> can figure out why.
>>
>> -jason
>>
>> On Jan 7, 2005, at 5:50 AM, michael watson ((IAH-C)) wrote:
>>
>>> Hi
>>>
>>> Having done some more tests with this:
>>>
>>> $hit->start()
>>>
>>> Actually returns a string which is the concatenation of query start
>>> and subject end! (btw I'm using the "-m 8" option) - surely this
>>> isn't the desired option????
>>>
>>> If I change it to:
>>>
>>> $hit->start('query')
>>>
>>> Then I get the correct start back, but I still get the stack trace
>>> error.
>>>
>>> The two co-ordinate sets which cause the problem (3264-3268 and
>>> 3252-3268) are on adjacent lines in the file (3252-3268 is the next
>>> line after 3264-3268) and are to the SAME subject, ie they are two
>>> HSPs of the same hit (in theory) but they are to two VERY different
>>> parts of the query.
>>>
>>> I'm guessing the way blasttable handles multiple HSPs is causing the
>>> trouble.
>>>
>>> Mick
>>>
>>> -----Original Message-----
>>> From: Marc Logghe [mailto:Marc.Logghe at devgen.com]
>>> Sent: 07 January 2005 10:41
>>> To: michael watson (IAH-C); Bioperl List
>>> Subject: RE: [Bioperl-l] Error parsing blast results with blasttable
>>>
>>>
>>>> while (my $result = $searchio->next_result) {
>>>> while(my $hit = $result->next_hit) {
>>>> my $start = $hit->start;
>>>>
>>>> And it is that call to $hit->start that sets off the whole trace.
>>>>
>>>> Any ideas?
>>>
>>>
>>> Hi Mick,
>>> Have you tried one of these ?:
>>>
>>> my $start = $hit->start('sbjct'); # or 'query' or 'hit'. Latter is
>>> same as 'sbjct'
>>>
>>> or
>>>
>>> my $start = $hit->hsp->start('sbjct');
>>>
>>>
>>> I think in all cases it defaults to 'query'. So it should not crash
>>> but give you the start position of the query. I am afraid I can't
>>> explain the crash, sorry.
>>>
>>> Marc
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at portal.open-bio.org
>>> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>>>
>>>
>> --
>> Jason Stajich
>> jason.stajich at duke.edu
>> http://www.duke.edu/~jes12/
>>
>>
> --
> Jason Stajich
> jason.stajich at duke.edu
> http://www.duke.edu/~jes12/
>
>
--
Jason Stajich
jason.stajich at duke.edu
http://www.duke.edu/~jes12/
More information about the Bioperl-l
mailing list