[Bioperl-l] Parsing "PCR_primers" tag from GenBank file

Horacio Montenegro h.montenegro at gmail.com
Mon Jun 15 02:56:25 UTC 2015


    Hi Roy,

    thanks, with your hints I was able to retrieve the
information I want, the way I want. Sadly, also thanks to your hints, I
discovered these records are so messy that I will have to review them
all by hand, which is still better than proceeding with wrong data.

    thanks again, cheers, Horacio

On Sun, Jun 14, 2015 at 6:24 PM, Roy Chaudhuri <roy.chaudhuri at gmail.com> wrote:
> Hi Horacio,
>
> The two "satellite" tags in GQ344853 are in different features, hence they
> are separated out in your code, whereas the PCR_primers tags are both in the
> same feature (source). get_tag_values (and get_tagset_values, which is
> similar but doesn't throw an error if the tag isn't found) returns a list
> with one element per occurrence of the specified tag in the feature, so you
> need to loop over that list if you want to separate the values out. Your
> original code passed the whole list directly to print, so the values were
> printed one after the other with nothing between them.
>
> Here's a modification of your code which should be closer to what you want:
>
> #!/usr/bin/env perl
> use strict;
> use warnings FATAL=>qw(all);
> use Bio::SeqIO;
> # Format is guessed from the .gb file extension
> my $seqio_object = Bio::SeqIO->new(-file => 'GQ344853.gb' );
> while (my $seq = $seqio_object->next_seq) {
>     print $seq->primary_id, "\t", $seq->length, "\n";
>     for my $feat_object ($seq->get_SeqFeatures) {
>         for my $tag (qw(satellite PCR_primers)) {
>             # get_tagset_values returns an empty list if the tag is absent,
>             # so features without the tag are skipped silently
>             for my $value ($feat_object->get_tagset_values($tag)) {
>                 print "$tag\t$value\n";
>             }
>         }
>     }
> }
>
> Cheers,
> Roy.
>
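
For anyone reading this thread later, the difference Roy describes between
printing a list directly and looping over it can be seen without BioPerl or
the GenBank file. This is only a minimal sketch: the two strings in @values
are made-up placeholders for what get_tag_values('PCR_primers') might return
for a feature with two PCR_primers tags, not the real contents of GQ344853.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical values (an assumption for illustration only), standing in
# for the list returned when a feature carries two PCR_primers tags.
my @values = (
    'fwd_name: F1, fwd_seq: acgt',
    'rev_name: R1, rev_seq: tgca',
);

# Passing the list straight to print writes the elements back to back,
# with no separator between them (as long as $, is not set).
my $joined = join '', @values;
print @values, "\n";

# Looping over the list gives one tab-separated line per value.
my @lines = map { "PCR_primers\t$_\n" } @values;
print @lines;
```

The first print gives both values run together on a single line, and the
second gives one "PCR_primers" line per value, which is what the loop in
Roy's code does for each feature.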

