[Bioperl-l] Problem with Unflattener

Scott Cain cain at cshl.org
Tue Dec 9 15:00:34 EST 2003


Hello Chris,

I am using Unflattener to write a genbank2gff script that is more
robust than what we have now.  As one of my example GenBank files, I am
using an A. gambiae chromosome:

http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=nucleotide&list_uids=31249389&dopt=GenBank&term=NW_045730&qty=1

When I try to run the simplified script below, I get the following
error:

------------- EXCEPTION  -------------
MSG: structure_type 2 is currently unknown
STACK Bio::SeqFeature::Tools::Unflattener::unflatten_seq /usr/local/lib/perl5/site_perl/5.8.1/Bio/SeqFeature/Tools/Unflattener.pm:1345
STACK toplevel ./simple.pl:19
 
--------------------------------------

As I read Unflattener, structure_type should only be set if I set it
explicitly, right?  So how is it getting set here, and how do I make it
stop?

Here's the script:
#!/usr/bin/perl -w
use strict;
use Bio::SeqIO;
use Bio::Tools::GFF;    # loaded explicitly; Bio::Tools::GFF->new is called below
use Bio::SeqFeature::Tools::Unflattener;

my $unflattener = Bio::SeqFeature::Tools::Unflattener->new;

my $seqio = Bio::SeqIO->new(
    -file   => 'NW_045730.1.gbk',
    -format => 'GenBank'
);

open( OUT, '>', 'out.gff' ) or die "Cannot open out.gff: $!";

while ( my $seq = $seqio->next_seq() ) {
    my $acc = $seq->accession;

    # get top-level unflattened SeqFeatureI objects
    my @sfs = $unflattener->unflatten_seq(
        -seq       => $seq,
        -use_magic => 1
    );

    foreach my $sf (@sfs) {
        my $gffio =
          $sf->gff_format( Bio::Tools::GFF->new( -gff_version => 3 ) );

        $sf->seq_id($acc);

        if ( $sf->primary_tag() eq 'source' ) {
            $sf->add_tag_value( 'ID', $acc );
            $sf->primary_tag('region');
        }
        print OUT $sf->gff_string . "\n";
    }
}
close OUT;
---------------------------

Thanks,
Scott

-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                         cain at cshl.org
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory


