[Bioperl-l] Odd problem with get_tag_values

Fields, Christopher J cjfields at illinois.edu
Fri Feb 24 21:28:52 UTC 2012


On 02/24/2012 02:43 PM, Adlai Burman wrote:
> I have come across a perplexing problem with trying to parse sequence features into hashes from gb records. This is the minimal code which shows my problem:
> 
> #!/usr/bin/perl
> use strict;
> use warnings;
> use IO::String;
> use Bio::Perl;
> use Bio::SeqIO;
> use IO::String;
> 
> my @files =</Users/adlai/Dropbox/atrsh/*>;
> foreach my $file(@files){
> 
> 
> my @cds_features = grep {$_->primary_tag eq 'CDS' } Bio::SeqIO->new(-file =>  $file)->next_seq->get_SeqFeatures;
> my %strands = map {$_->get_tag_values('gene'), $_->strand} @cds_features; ##This Is The Culprit.
> .
> .
> .
> #do nifty stuff
> }
> 
> For some files this approach works just fine.
> For others the script dies immediately with the error message:
> 
> ------------- EXCEPTION -------------
> MSG: asking for tag value that does not exist gene
> STACK Bio::SeqFeature::Generic::get_tag_values /Users/adlai/Downloads/BioPerl-1.6.1/Bio/SeqFeature/Generic.pm:517
> STACK toplevel tosend.pl:16
> -------------------------------------

There are two possibilities:

1) There is at least one feature w/o a 'gene' tag for those files.
2) This is a bug.

Either way, it's hard to tell because we don't have the example data you are checking.

I would note that this is *not* the way to screen for features with specific tags, though, at least with the current API: get_tag_values throws an exception when the feature lacks the requested tag, which is the error you are seeing. You have to check for the presence of the tag first with has_tag('gene'). You could do that within the grep:

{ $_->primary_tag eq 'CDS' && $_->has_tag('gene') }
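For illustration, here is a minimal sketch of that filter feeding the gene-to-strand hash. It assumes BioPerl is installed; the sub name gene_strands and the in-memory features (gene name 'rbcL', coordinates) are made up for the example and stand in for whatever get_SeqFeatures returns from a real record. It also takes only the first value returned by get_tag_values, since that method returns a list and a feature with more than one 'gene' value would otherwise misalign the key/value pairs in the map.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Bio::SeqFeature::Generic;

# Build a gene => strand hash from CDS features, skipping any CDS
# that has no 'gene' tag instead of letting get_tag_values throw.
# (gene_strands is a hypothetical helper name, not part of BioPerl.)
sub gene_strands {
    my @features = @_;

    # Keep only CDS features that actually carry a 'gene' tag.
    my @cds = grep { $_->primary_tag eq 'CDS' && $_->has_tag('gene') } @features;

    return map {
        # get_tag_values returns a list; use the first value only so
        # each feature contributes exactly one key/value pair.
        my ($gene) = $_->get_tag_values('gene');
        ($gene => $_->strand);
    } @cds;
}

# Hypothetical in-memory features standing in for a parsed GenBank record.
my @features = (
    # CDS with a gene tag: kept.
    Bio::SeqFeature::Generic->new(
        -start => 1, -end => 90, -strand => 1,
        -primary_tag => 'CDS',
        -tag => { gene => 'rbcL' },
    ),
    # CDS without a gene tag: skipped rather than raising an exception.
    Bio::SeqFeature::Generic->new(
        -start => 200, -end => 290, -strand => -1,
        -primary_tag => 'CDS',
        -tag => { product => 'hypothetical protein' },
    ),
    # Non-CDS feature: filtered out by the primary_tag check.
    Bio::SeqFeature::Generic->new(
        -start => 1, -end => 40, -strand => 1,
        -primary_tag => 'exon',
        -tag => { gene => 'rbcL' },
    ),
);

my %strands = gene_strands(@features);
print "$_ => $strands{$_}\n" for sort keys %strands;
```

If you would rather find out which CDS features are missing the tag than silently skip them, the same has_tag check can be used to print the location of the offending features before building the hash.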

> The difference in the files that parse and those that don't seems to be that the files that crash have "intron" and "exon" tags. They ALL have "gene" tags.
> Does anyone know why this is a problem and what can be done to circumvent it?
> 
> Thanks,
> Adlai

chris




