[Bioperl-l] Problem with Unflattener
Scott Cain
cain at cshl.org
Tue Dec 9 15:00:34 EST 2003
Hello Chris,
I am using Unflattener to write a genbank2gff script that is more
robust than the one we have now. As one of my example GenBank files, I am
using an A. gambiae chromosome:
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=nucleotide&list_uids=31249389&dopt=GenBank&term=NW_045730&qty=1
When I try to run the simplified script below, I get the following
error:
------------- EXCEPTION -------------
MSG: structure_type 2 is currently unknown
STACK Bio::SeqFeature::Tools::Unflattener::unflatten_seq /usr/local/lib/perl5/site_perl/5.8.1/Bio/SeqFeature/Tools/Unflattener.pm:1345
STACK toplevel ./simple.pl:19
--------------------------------------
As I read the Unflattener code, structure_type should only be set if I
set it explicitly, right? So how is it getting set here, and how do I
stop that from happening?
Here's the script:
#!/usr/bin/perl -w
use strict;
use Bio::SeqIO;
use Bio::Tools::GFF;
use Bio::SeqFeature::Tools::Unflattener;

my $unflattener = Bio::SeqFeature::Tools::Unflattener->new;
my $seqio = Bio::SeqIO->new(
    -file   => 'NW_045730.1.gbk',
    -format => 'GenBank'
);
my $gff = Bio::Tools::GFF->new( -gff_version => 3 );    # shared GFF3 formatter

open( OUT, '>', 'out.gff' ) or die "Cannot open out.gff: $!";
while ( my $seq = $seqio->next_seq() ) {
    my $acc = $seq->accession;

    # get top-level unflattened SeqFeatureI objects
    my @sfs = $unflattener->unflatten_seq(
        -seq       => $seq,
        -use_magic => 1
    );
    foreach my $sf (@sfs) {
        $sf->gff_format($gff);
        $sf->seq_id($acc);

        # report the GenBank 'source' feature as a GFF3 'region'
        if ( $sf->primary_tag() eq 'source' ) {
            $sf->add_tag_value( 'ID', $acc );
            $sf->primary_tag('region');
        }
        print OUT $sf->gff_string . "\n";
    }
}
close OUT;
---------------------------
Thanks,
Scott
--
------------------------------------------------------------------------
Scott Cain, Ph. D. cain at cshl.org
GMOD Coordinator (http://www.gmod.org/) 216-392-3087
Cold Spring Harbor Laboratory