[Bioperl-l] Error parsing blast results with blasttable

Jason Stajich jason.stajich at duke.edu
Fri Jan 7 08:48:30 EST 2005


On Jan 7, 2005, at 8:34 AM, michael watson ((IAH-C)) wrote:

> Actually, the very first answer from Marc solved a bug in my script -
> I'm simply trying to get the query start and end of the HSPs, I don't
> need them to be linked together into a coherent hit object with 
> multiple
> HSPs, I'd be happy with them separate.  I'm not trying to do anything
> fancy, just mark up the HSPs as features on the query sequence.
>
> When running the exact same script using "blast" format instead of
> "blasttable" I get exactly what I need - I was trying to use blasttable
> for efficiency sake though.
>

Sure.  that is the right thing to do.  but calling $hit->start 
$hit->end is not really doing what you want so don't use it.


If you want the start and end of HSPs you need to be calling that on 
the HSPs themselves.  If you look at the searchIO howto
you'll see this construct

while (my $result = $searchio->next_result) {
         while(my $hit = $result->next_hit) {
		while( my $hsp = $hit->next_hsp) {
			print $hsp->query->stary, " ",$hsp->query->end, "\n";
		}
	}
}

That will work and get you the HSP start and end for the query.  use 
$hsp->hit->start and $hsp->hit->end to get the start/end of the hit 
sequence coordinates in the HSP.

-jason


> Thanks
> Mick
>
> -----Original Message-----
> From: Jason Stajich [mailto:jason.stajich at duke.edu]
> Sent: 07 January 2005 13:30
> To: michael watson (IAH-C)
> Cc: Bioperl List; Marc Logghe
> Subject: Re: [Bioperl-l] Error parsing blast results with blasttable
>
>
> Right - but back to my question - what do you want to be getting out?
> Do you want the smallest HSP start position if you are calling
> $hit->start('query')?  Are you hoping for fancy HSP tiling?
>
> I'm pretty sure the problem you are showing has to do with being unable
> to build a single compatible tiling path for a set of HSPs.  I just
> think the code for doing this is just too brittle.  There may also be a
> bug since the blasttable parser has less data available too it and that
> may be the cause as well, so will have to be investigated nonetheless.
>
> -jason
> On Jan 7, 2005, at 8:21 AM, michael watson ((IAH-C)) wrote:
>
>> Hi
>>
>> I submitted a bug which contains some blasttable output, example code,
>
>> and the error produced.
>>
>> Mick
>>
>> -----Original Message-----
>> From: Jason Stajich [mailto:jason.stajich at duke.edu]
>> Sent: 07 January 2005 13:04
>> To: michael watson (IAH-C)
>> Cc: Bioperl List; Marc Logghe
>> Subject: Re: [Bioperl-l] Error parsing blast results with blasttable
>>
>>
>> $hit->start is a convience function which first tiles the HSPs and
>> then gives you the smallest start.
>>
>> If you look at the documentation for the method you see that not
>> giving it a type will give you start in the query and hit.
>>
>>   Usage     : $sbjct->start( [seq_type] );
>>   Purpose   : Gets the start coordinate for the query, sbjct, or both
>> sequences
>>             : in the BlastHit object. If there is more than one HSP,
>> the
>>
>> lowest start
>>             : value of all HSPs is returned.
>>   Example   : $qbeg = $sbjct->start('query');
>>             : $sbeg = $sbjct->start('hit');
>>             : ($qbeg, $sbeg) = $sbjct->start();
>>   Returns   : scalar context: integer
>>             : array context without args: list of two integers
>> (queryStart, sbjctStart)
>>             : Array context can be "induced" by providing an argument
>> of
>>
>> 'list' or 'array'.
>>   Argument  : In scalar context: seq_type = 'query' or 'hit' or
>> 'sbjct' (default = 'query')
>>               ('sbjct' is synonymous with 'hit')
>>   Throws    : n/a
>>   Comments  : This method requires that all HSPs be tiled. If there is
>
>> more than one
>>             : HSP and they have not already been tiled, they will be
>> tiled first automatically..
>>             : Remember that the start and end coordinates of all HSPs
>> are
>>             : normalized so that start < end. Strand information can
> be
>>             : obtained by calling $hit->strand().
>>
>> I don't know why you are seeing concatenated positions unless you are
>> somehow getting it in array context and then turning it into a string.
>>
>>
>> I really don't use this, if I want tiled HSPs I use WU-BLAST with the
>> -links and build compatible HSP groups.
>>
>> What are you trying to get - the smallest hit or query start? Just the
>
>> start/end for HSPs?
>>
>> If this is somehow a blasttable specific problem will try and see if
>> can figure out why.
>>
>> -jason
>>
>> On Jan 7, 2005, at 5:50 AM, michael watson ((IAH-C)) wrote:
>>
>>> Hi
>>>
>>> Having done some more tests with this:
>>>
>>> $hit->start()
>>>
>>> Actually returns a string which is the concatenation of query start
>>> and subject end!  (btw I'm using the "-m 8" option) - surely this
>>> isn't the desired option????
>>>
>>> If I change it to:
>>>
>>> $hit->start('query')
>>>
>>> Then I get the correct start back, but I still get the stack trace
>>> error.
>>>
>>> The two co-ordinate sets which cause the problem (3264-3268 and
>>> 3252-3268) are on adjacent lines in the file (3252-3268 is the next
>>> line after 3264-3268) and are to the SAME subject, ie they are two
>>> HSPs of the same hit (in theory) but they are to two VERY different
>>> parts of the
>>> query.
>>>
>>> I'm guessing the way blasttable handles multiple HSPs is causing the
>>> trouble.
>>>
>>> Mick
>>>
>>> -----Original Message-----
>>> From: Marc Logghe [mailto:Marc.Logghe at devgen.com]
>>> Sent: 07 January 2005 10:41
>>> To: michael watson (IAH-C); Bioperl List
>>> Subject: RE: [Bioperl-l] Error parsing blast results with blasttable
>>>
>>>
>>>> while (my $result = $searchio->next_result) {
>>>>         while(my $hit = $result->next_hit) {
>>>>                 my $start  = $hit->start;
>>>>
>>>> And it is that call to $hit->start that sets off the whole trace.
>>>>
>>>> Any ideas?
>>>
>>>
>>> Hi Mick,
>>> Have you tried one of these ?:
>>>
>>> my $start = $hit->start('sbjct');  # or 'query' or 'hit'. Latter is
>>> same as 'sbjct'
>>>
>>> or
>>>
>>> my $start = $hit->hsp->start('sbjct');
>>>
>>>
>>> I think in all cases it defaults to 'query'. So it should not crash
>>> but give you the start position of the query. I am afraid I can't
>>> explain the crash, sorry.
>>>
>>> Marc
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at portal.open-bio.org
>>> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>>>
>>>
>> --
>> Jason Stajich
>> jason.stajich at duke.edu
>> http://www.duke.edu/~jes12/
>>
>>
> --
> Jason Stajich
> jason.stajich at duke.edu
> http://www.duke.edu/~jes12/
>
>
--
Jason Stajich
jason.stajich at duke.edu
http://www.duke.edu/~jes12/



More information about the Bioperl-l mailing list