[Bioperl-l] Hit length using length_aln()

Brian Desany bdesany@bcm.tmc.edu
Mon, 15 Jul 2002 13:08:32 -0500


>OR, is there a better way to retrieve the ratio (percent) of
>the entire hit to
>the query.

I think you mean "total hsp lengths" rather than the "entire hit" in your
question; otherwise you would just use $hit->length/$result->query_length,
which probably isn't what you want.

(The length of the "hit" has no relation to the alignment. It is simply the
length of the database entry that was hit, for example 5Mb for whatever
chromosome you happened to hit.)

Also, I interpret your statement to mean "across all the hsps in the hit, how
much of the total query matched anything?" (Correct me if I have it wrong).

In any case I usually loop through the hsps as you do, but I use different
values:

e.g.:

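my $hsps_len_test = 0;
my @hsp_s = $hit->hsps();
# Denominator is the full query length, not the hit (database entry) length.
my $lenAll = $result->query_length();
foreach my $hsp (@hsp_s){
	# Length of the query portion of this hsp only.
	$hsps_len_test += $hsp->length('query');
}
print "Ratio of hsps to query: $hsps_len_test/$lenAll\n";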


Note that this doesn't give you the right answer if your "hit" happens to
have repetitive elements in it and your query matches them, because the same
stretch of query is then counted once for every hsp it appears in. In fact
you could potentially get a "total hsp lengths" greater than the query
length. In that case you would have to look at the interval of the query
that is present in each hsp (start & end coordinates) and do some kind of
interval merging to find out what your final covered interval is. (I'm not
really sure how that's done in Bioperl but I think you use Range objects).
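
For what it's worth, the interval merging can be sketched in plain Perl
without any Range objects. This is only a sketch under some assumptions: the
sub names merge_intervals and covered_length are made up for illustration,
the coordinates are treated as 1-based and inclusive, and the example numbers
are hypothetical. In a real script the pairs would presumably come from
something like $hsp->start('query') and $hsp->end('query'), and the
denominator from $result->query_length().

```perl
use strict;
use warnings;

# Merge 1-based, inclusive [start, end] pairs; returns the merged pairs.
# (Hypothetical helper, not part of Bioperl.)
sub merge_intervals {
    my @norm;
    for my $iv (@_) {
        my ($s, $e) = @$iv;
        ($s, $e) = ($e, $s) if $s > $e;   # tolerate reversed coordinates
        push @norm, [$s, $e];             # copy, so the caller's data is untouched
    }
    my @sorted = sort { $a->[0] <=> $b->[0] } @norm;

    my @merged;
    for my $iv (@sorted) {
        if (@merged && $iv->[0] <= $merged[-1][1] + 1) {
            # Overlaps or directly abuts the previous interval: extend it.
            $merged[-1][1] = $iv->[1] if $iv->[1] > $merged[-1][1];
        } else {
            push @merged, $iv;
        }
    }
    return @merged;
}

# Number of query positions covered by at least one interval.
sub covered_length {
    my $total = 0;
    $total += $_->[1] - $_->[0] + 1 for merge_intervals(@_);
    return $total;
}

# Hypothetical query coordinates for three hsps, two of which overlap.
my @query_ranges = ([1, 100], [50, 150], [300, 400]);
my $query_length = 500;   # stand-in for $result->query_length()

my $covered = covered_length(@query_ranges);
printf "Query covered by hsps: %d/%d (%.1f%%)\n",
    $covered, $query_length, 100 * $covered / $query_length;
```

With this approach the covered length can never exceed the query length,
since each query position is counted at most once no matter how many hsps
include it.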

HTH

-Brian.




>Thanks for the quick replies, (sorry I haven't got back sooner
>but I went
>home early on Friday and tried to not even think about work ;)
>
>The line: $hsp->hsp_length('total')  works.
>This allows me to compare the Hit to a value that I am
>expecting to see.
>And I was able to create another bit of info that I'll
>want using the
>following:
>my @hsp_s = $hit->hsps();
>my $lenAll = $hit->length('total');
>foreach my $hsp (@hsp_s){
>	$hsps_len_test += $hsp->length();
>}
>print "Ratio of hsps to hit: $hsps_len_test/$lenAll\n";
>
>OR, is there a better way to retrieve the ratio (percent) of
>the entire hit to
>the query.
>Thanks again,
>Ken
>