[Bioperl-l] Error parsing Genbank file
Ryan Golhar
golharam at umdnj.edu
Wed Jan 5 15:41:33 EST 2005
Hi all,
I have a GenBank file that Bio::SeqIO::genbank is choking on. The
entry is just a WGS entry referencing a bunch of other entries. It dies
on line 492 with the error "Unexpected error in feature table for
Skipping feature, attempting to recover".
I'm using the following code:
#!/usr/bin/perl
use strict;
use Bio::SeqIO;
my $usage = "$0 <genbank file> <fasta file>\n";
my $file = shift or die $usage;
my $outfilename = shift or die $usage;
my $infile = Bio::SeqIO->new('-file'   => "<$file",
                             '-format' => "genbank");
my $outfile = Bio::SeqIO->new('-file'   => ">$outfilename",
                              '-format' => "fasta");
while (my $seq = $infile->next_seq) {
    # print STDERR $seq->accession_number,"\n";
    $outfile->write_seq($seq);
}
Here are the contents of the GenBank entry:
LOCUS CAAB01000000 12381 rc DNA linear VRT 22-AUG-2002
DEFINITION Takifugu rubripes whole genome shotgun sequencing project.
ACCESSION CAAB00000000
VERSION CAAB00000000.1 GI:22418063
KEYWORDS WGS.
SOURCE Takifugu rubripes (Fugu rubripes)
ORGANISM Takifugu rubripes
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Actinopterygii; Neopterygii; Teleostei; Euteleostei; Neoteleostei;
Acanthomorpha; Acanthopterygii; Percomorpha; Tetraodontiformes;
Tetradontoidea; Tetraodontidae; Takifugu.
REFERENCE 1 (bases 1 to 12381)
AUTHORS The Fugu Genome Sequencing Consortium.
TITLE Direct Submission
JOURNAL Submitted (01-JUL-2002) The Fugu Genome Sequencing Consortium,
http://www.fugubase.org/ http://www.jgi.doe.gov/fugu
COMMENT The Takifugu rubripes whole genome shotgun (WGS) project has the
project accession CAAB00000000. This version of the project (01)
has the accession number CAAB01000000, and consists of sequences
CAAB01000001-CAAB01012381.
FEATURES Location/Qualifiers
source 1..12381
/organism="Takifugu rubripes"
/mol_type="genomic DNA"
/db_xref="taxon:31033"
WGS CAAB01000001-CAAB01012381
//
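One way to isolate entries like this one before they reach the parser is sketched below. It is plain Perl and does not use BioPerl. The sub name find_wgs_master_records is hypothetical, and the sketch assumes (unconfirmed) that the trouble comes from records that carry a WGS line but no ORIGIN section, as the entry above does.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper, not part of BioPerl: returns the LOCUS names of
# records that have a WGS line but no ORIGIN (sequence) section.
# Assumption: such "master" records are the ones that upset the parser.
sub find_wgs_master_records {
    my ($fh) = @_;
    my @masters;
    my ($locus, $has_wgs, $has_origin) = (undef, 0, 0);
    while (my $line = <$fh>) {
        if ($line =~ /^LOCUS\s+(\S+)/) {
            # Start of a new record: remember its name, reset the flags.
            ($locus, $has_wgs, $has_origin) = ($1, 0, 0);
        }
        elsif ($line =~ /^WGS\s/)    { $has_wgs    = 1 }
        elsif ($line =~ /^ORIGIN\b/) { $has_origin = 1 }
        elsif ($line =~ m{^//}) {
            # End of record: keep it if it looks like a WGS master entry.
            push @masters, $locus
                if defined $locus && $has_wgs && !$has_origin;
            ($locus, $has_wgs, $has_origin) = (undef, 0, 0);
        }
    }
    return @masters;
}

# Usage: perl find_wgs.pl <genbank file>
if (@ARGV) {
    open my $fh, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!\n";
    print "$_\n" for find_wgs_master_records($fh);
    close $fh;
}
```

Run against a multi-record file, this prints one LOCUS name per suspected master record, which can then be skipped or split out before the conversion to FASTA.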
-----
Ryan Golhar
Computational Biologist
The Informatics Institute at
The University of Medicine & Dentistry of NJ
Phone: 973-972-5034
Fax: 973-972-7412
Email: golharam at umdnj.edu