[Bioperl-l] Odd problem with get_tag_values

Roy Chaudhuri roy.chaudhuri at gmail.com
Mon Feb 27 10:33:48 UTC 2012


Just to chip in on this, you can use get_tagset_values instead of 
get_tag_values - the former has (to me) the more Perl-ish behaviour of 
returning an empty list if there are none of the requested tags present, 
meaning that you can skip the has_tag step.
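
For example, here is a minimal sketch of that approach. The two in-memory features and the gene name 'rbcL' are made up for illustration and are not taken from Adlai's files. In a real script the features would come from Bio::SeqIO as in the code quoted below.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Bio::SeqFeature::Generic;

# Hypothetical CDS features built in memory: one carries a 'gene' tag,
# the other only a 'product' tag.
my @cds_features = (
    Bio::SeqFeature::Generic->new(
        -primary_tag => 'CDS', -start => 1, -end => 90, -strand => 1,
        -tag         => { gene => 'rbcL' },
    ),
    Bio::SeqFeature::Generic->new(
        -primary_tag => 'CDS', -start => 200, -end => 320, -strand => -1,
        -tag         => { product => 'hypothetical protein' },
    ),
);

my %strands;
for my $cds (@cds_features) {
    # get_tagset_values returns an empty list when the tag is absent,
    # so $gene is simply undef here instead of an exception being thrown.
    my ($gene) = $cds->get_tagset_values('gene');
    next unless defined $gene;
    $strands{$gene} = $cds->strand;
}

print "$_\t$strands{$_}\n" for sort keys %strands;
```

get_tagset_values also accepts several tag names at once and returns their values in the order the tags were given, so a call such as get_tagset_values(qw(gene locus_tag product)) could serve as a simple fallback chain.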

Cheers,
Roy.

On 24/02/2012 21:57, Adlai Burman wrote:
>
> On Feb 24, 2012, at 10:46 PM, Fields, Christopher J wrote:
>
>> Using has_tag('gene') as a pre-screen works for me for both example
>> seqs.
>>
>
> Me too :-)
>
> Dobrou noc and cheers,
>
> Adlai
>> chris
>>
>> On Feb 24, 2012, at 3:33 PM, Adlai Burman wrote:
>>
>>> Thanks so much, Jason. I will give that a try after I get a
>>> few hours of much needed sleep :-)
>>>
>>>
>>> On Feb 24, 2012, at 10:21 PM, Jason Stajich wrote:
>>>
>>>> Not all CDS features will be annotated with a 'gene' tag.
>>>> Annotation practice varies, and there is no requirement that
>>>> every CDS feature carry a gene tag.
>>>>
>>>> You can protect your query - we often do this when dealing with
>>>> data from the wild by testing for has_tag first.
>>>>
>>>> my %strands;
>>>> for my $cds ( grep { $_->primary_tag eq 'CDS' }
>>>>     Bio::SeqIO->new(-file => $file)->next_seq->get_SeqFeatures ) {
>>>>   if ( $cds->has_tag('gene') ) {
>>>>     # get the 1st one, this returns a list
>>>>     my ($gene) = $cds->get_tag_values('gene');
>>>>     $strands{$gene} = $cds->strand;
>>>>   } else {
>>>>     # look in alternative places for a name, e.g. locus, ...
>>>>   }
>>>> }
>>>>
>>>> An alternative is to loop through your list of tags in order of
>>>> preference
>>>>
>>>> my %strands;
>>>> for my $cds ( grep { $_->primary_tag eq 'CDS' }
>>>>     Bio::SeqIO->new(-file => $file)->next_seq->get_SeqFeatures ) {
>>>>   my $seen = 0;
>>>>   for my $tag ( qw(gene locus name product accession note) ) {
>>>>     if ( $cds->has_tag($tag) ) {
>>>>       # get the 1st one, this returns a list
>>>>       my ($name) = $cds->get_tag_values($tag);
>>>>       $strands{$name} = $cds->strand;
>>>>       $seen = 1;
>>>>       last;
>>>>     }
>>>>   }
>>>>   if ( !$seen ) {
>>>>     warn("no tag found for feature at ",
>>>>          $cds->location->to_FTstring, "\n");
>>>>   }
>>>> }
>>>>
>>>> On Feb 24, 2012, at 12:43 PM, Adlai Burman wrote:
>>>>
>>>>> I have come across a perplexing problem with trying to parse
>>>>> sequence features into hashes from gb records. This is the
>>>>> minimal code which shows my problem:
>>>>>
>>>>> #!/usr/bin/perl
>>>>> use strict;
>>>>> use warnings;
>>>>> use IO::String;
>>>>> use Bio::Perl;
>>>>> use Bio::SeqIO;
>>>>>
>>>>> my @files = </Users/adlai/Dropbox/atrsh/*>;
>>>>> foreach my $file (@files) {
>>>>>   my @cds_features = grep { $_->primary_tag eq 'CDS' }
>>>>>     Bio::SeqIO->new(-file => $file)->next_seq->get_SeqFeatures;
>>>>>   my %strands = map { $_->get_tag_values('gene'), $_->strand }
>>>>>     @cds_features;  ## This Is The Culprit
>>>>>   # . . . do nifty stuff
>>>>> }
>>>>>
>>>>> For some files this approach works just fine. For others the
>>>>> script dies immediately with the error message:
>>>>>
>>>>> ------------- EXCEPTION -------------
>>>>> MSG: asking for tag value that does not exist gene
>>>>> STACK Bio::SeqFeature::Generic::get_tag_values /Users/adlai/Downloads/BioPerl-1.6.1/Bio/SeqFeature/Generic.pm:517
>>>>> STACK toplevel tosend.pl:16
>>>>> -------------------------------------
>>>>>
>>>>> The difference in the files that parse and those that don't
>>>>> seems to be that the files that crash have "intron" and
>>>>> "exon" tags. They ALL have "gene" tags. Does anyone know why
>>>>> this is a problem and what can be done to circumvent it?
>>>>>
>>>>> Thanks, Adlai
>>>>>
>>>>>
>>>>> _______________________________________________ Bioperl-l
>>>>> mailing list Bioperl-l at lists.open-bio.org
>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>>
>>>> Jason Stajich jason.stajich at gmail.com jason at bioperl.org
>>>>
>>>>



