[Bioperl-l] BioPerl help with 2D arrays
Chris Fields
cjfields at uiuc.edu
Tue Jun 24 15:39:47 UTC 2008
On Jun 23, 2008, at 2:32 PM, Nathan Haseley (RIT Student) wrote:
> Hello,
> I am writing a script that returns an array of array references
> to Bio::Search::HSP::GenericHSP objects (2D array of
> Bio::Search::HSP::GenericHSP objects). Whenever I try to call
> functions such as ->num_identical I get an error message:
> Can't call method "num_identical" without a package or object
> reference.
>
> Below are the segments of the code that are giving me problems to
> give you a general idea of what I'm doing. Is there a way around
> this? What am I doing wrong? Thanks!
> Sincerely,
> Nathan
>
> my $in = new Bio::SearchIO( -format => 'blast',
> -file => $file );
> my $ j = -1;
> while( $result = $in->next_result and ref($result)) {
$result should always be assigned an object ref or undef, the latter
which kills the loop, so '... and ref($result)' isn't needed.
> ++$j;
> while( $has_species == 0) {
> if( my $hit = $result->next_hit) {
> if( $hit -> description =~ /$species_names[$i]/i) {
> $has_species = 1;
> $temp[$i] = $hit->next_hsp;
> $result->rewind;
> }
> }
> }
I think you want to incorporate a simple hash construct for your
species names. However...
You have left out a significant bit of code, so I am finding it hard
to determine what you are trying to accomplish. For instance, from
the above it seems you always want to always match to the same species
name, as $i never changes (so $species_name[$i] also never changes).
The while loop as represented above is pretty dangerous, as you can
feasibly enter an infinite loop unless you get both a hit AND the hit
desc matches the regex. Also, you're rewinding the main $result
iterator ($result->rewind) while inside the $result iterator loop; not
sure why you are doing that.
Note that '$hit->next_hsp;' can also be dangerous under some
circumstances, as this can give you undef in cases where no HSP
alignment is returned (e.g. the hit data is only in the hit table).
May be one source of your problems.
> $homologs[$j] = [@temp];
...
> return $homologs
Shouldn't that be @homologs? You use '$homologs[$j]' above, which is
a scalar in the array @homologs.
If you aren't 'use strict/warnings', this creates a local scalar
$homologs then returns the value of $homologs (undef), which could be
the problem. Using 'strict/warnings' should catch that.
> .
> .
> . (eventually in a separate fucntion)
> $temp = $homologs[$i]->[$j];
> $temp->num_identical;
I would go about this by creating a lookup table (hash or array) of
species names, then iterate through each BLAST report to either
(1) generically grab the species name generically and check it against
the lookup table using 'exists',
%lookup = map {$_ => 1} ('species1', 'species2');
# if species is in brackets, match inside the brackets
if ($hit->description =~ /\[([^\]]+)\]/) {
$sp = $1;
if (exists $lookup{$sp}) {
# do something
} else { # do something on failure }
}
or (2) check each name (the key in the lookup table) against the
description using a 'grep', such as:
%lookup = map {$_ => 1} ('species1', 'species2');
if (grep {$desc =~ /$_/} keys %lookup) {...} else {# do something on
failure}
# could use array of names as a lookup as well
Depending on how the varied the descriptions are; it may only be
feasible to try the latter one.
From there you could store the HSP data in a hash, using the species
name:
$temp{$species} = $hit->next_hsp || warn 'No HSP returned';
and store that by the report # or description:
# array of reports
push @reports, \%temp;
# hash of reports, by query name
$report{$result->query_name} = \%temp;
chris
More information about the Bioperl-l
mailing list