[Bioperl-l] New hmmpfam parser
Chris Fields
cjfields at uiuc.edu
Mon Aug 21 00:25:32 UTC 2006
Sendu,
Could you post the example file you used somewhere for testing?
Chris
On Aug 20, 2006, at 4:56 PM, Sendu Bala wrote:
> I've added a new hmmpfam parser to bioperl-live.
>
> You access it with Bio::SearchIO->new(-format => "hmmer_pull"). It uses
> the new Bio::PullParserI discussed in the thread 'SearchIO speedup'.
>
> The major differences between it and the existing SearchIO plugin for
> hmmpfam reports (hmmer.pm) are speed, memory usage and how it deals with
> hits and hsps. hmmer.pm breaks the Bio::Search::HitI API by having hit
> (model) name()s that are not unique within a ResultI. It also only ever
> has one domain per model. hmmer_pull.pm has unique model names and as
> many domains per model as there are in the file being parsed.
> hmmer_pull.pm also gives back more correct answers when you try to use
> the full variety of HitI, GenericHit, HSPI and GenericHSP methods.
>
>
> Speed tested on one example hmmpfam report of 441 KB, comparing hmmer.pm
> and hmmer_pull.pm
> (memory usage was always ~1.8x lower):
>
> # for the result for query sequence 'test5' (5th result of 10 in my
> # test dataset), just get the most significant domain of the most
> # significant model:
> # while ($result = $searchio->next_result) {
> #     if ($result->query_name eq 'test5') {
> #         $result->sort_hits(sub{#sort by significance});
> #         $hit = $result->next_hit;
> #         $hsp = $hit->hsp('best');
> #         last;
> #     }
> # }
> 23.5x faster
>
> # while ($result = $searchio->next_result) { # do nothing }
> 38x faster
>
> # while ($result = $searchio->next_result) {
> #     while ($hit = $result->next_hit) {
> #         while ($hsp = $hit->next_hsp) { # do nothing }
> #     }
> # }
> 5.3x faster
>
> # while ($result = $searchio->next_result) {
> #     while ($hit = $result->next_hit) {
> #         while ($hsp = $hit->next_hsp) {
> #             $fi = $hsp->frac_identical('query');
> #         }
> #     }
> # }
> (note that hmmer.pm returns the wrong answer for $fi: 0)
> 2.2x faster
Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign