[Bioperl-l] Aggressive aggregation?
Aaron J. Mackey
amackey at pcbi.upenn.edu
Mon Mar 14 12:39:26 EST 2005
In the "FWIW" category:
This is what I did to break the "aggressive aggregation" (patch
attached). It relies on the fact that when aggregation occurs, the base
feature's range always contains, or at least overlaps, the subfeatures'
ranges (true in my use cases so far). So in the patch below, when more
than one base feature is detected for a group, range checking kicks in.
This won't help you if, for instance, you're saving separate HSP linking
information as different hits (because the hits will still overlap), but
it does solve the more common case of one protein/EST matching in
multiple, distinct locations on the genome.
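
To make the idea concrete, here is a minimal standalone sketch of the range check. It does not use Bio::DB::GFF at all: the features are hypothetical plain hashes with start/end coordinates, and overlaps() and partition_subparts() are illustrative helpers, not BioPerl API. It decides on range checking from the total number of bases in the group, which is an assumption drawn from the description above, not a line-for-line copy of the patch.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a feature: a hash with name/start/end.
# Simple closed-interval overlap test (not Bio::RangeI::overlaps).
sub overlaps {
    my ($x, $y) = @_;
    return $x->{start} <= $y->{end} && $y->{start} <= $x->{end};
}

# Give each base its subparts. Range checking only applies when the
# group holds more than one base; a lone base takes every subpart.
sub partition_subparts {
    my ($bases, $subparts) = @_;
    my @result;
    for my $base (@$bases) {
        my @mine;
        if (@$bases > 1) {
            @mine = grep { overlaps($_, $base) } @$subparts;
        } else {
            @mine = @$subparts;
        }
        push @result, { base => $base, subparts => \@mine };
    }
    return \@result;
}

# One EST matching two distinct genomic locations under one group key.
my @bases = (
    { name => 'match1', start => 1_000,  end => 2_000  },
    { name => 'match2', start => 50_000, end => 51_000 },
);
my @subparts = (
    { name => 'hsp1', start => 1_000,  end => 1_400  },
    { name => 'hsp2', start => 1_600,  end => 2_000  },
    { name => 'hsp3', start => 50_000, end => 51_000 },
);

my $aggregated = partition_subparts(\@bases, \@subparts);
for my $agg (@$aggregated) {
    printf "%s: %s\n", $agg->{base}{name},
        join(',', map { $_->{name} } @{ $agg->{subparts} });
}
# prints:
#   match1: hsp1,hsp2
#   match2: hsp3
```

If the two bases themselves overlapped (the separate-HSP-linking case mentioned above), a subpart in the shared region would be handed to both of them, which is exactly the situation this approach cannot untangle.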
-Aaron
-------------- next part --------------
diff -u -r1.30 Aggregator.pm
--- Aggregator.pm 3 Aug 2004 09:17:23 -0000 1.30
+++ Aggregator.pm 14 Mar 2005 17:45:35 -0000
@@ -303,7 +303,7 @@
? join ($;,$feature->group,$feature->refseq,$feature->source)
: join ($;,$feature->group,$feature->refseq);
if ($main_method && lc $feature->method eq lc $main_method) {
- $aggregates{$key}{base} ||= $feature->clone;
+ push @{$aggregates{$key}{base}}, $feature->clone;
} else {
push @{$aggregates{$key}{subparts}},$feature;
}
@@ -321,18 +321,29 @@
if ($require_whole_object && $self->components) {
next unless $aggregates{$_}{base}; # && $aggregates{$_}{subparts};
}
- my $base = $aggregates{$_}{base};
+
+ my $base = shift @{$aggregates{$_}{base} || []};
unless ($base) { # no base, so create one
my $first = $aggregates{$_}{subparts}[0];
$base = $first->clone; # to inherit parent coordinate system, etc
$base->score(undef);
$base->phase(undef);
}
- $base->method($pseudo_method);
- $base->add_subfeature($_) foreach @{$aggregates{$_}{subparts}};
- $base->adjust_bounds;
- $base->compound(1); # set the compound flag
- push @result,$base;
+ while ($base) {
+ $base->method($pseudo_method);
+ if (@{$aggregates{$_}{base} || []}) {
+ # only capture those subfeatures that overlap the base
+ for my $part (@{$aggregates{$_}{subparts}}) {
+ $base->add_subfeature($part) if $part->overlaps($base, "strong");
+ }
+ } else {
+ $base->add_subfeature($_) foreach @{$aggregates{$_}{subparts}};
+ }
+ $base->adjust_bounds;
+ $base->compound(1); # set the compound flag
+ push @result,$base;
+ $base = shift @{$aggregates{$_}{base} || []};
+ }
}
@$features = @result;
}