[Bioperl-l] bp_genbank2gff3.pl
Eric Just
e-just at northwestern.edu
Fri Jan 26 17:08:49 UTC 2007
Hi there,
I am getting some strange results with bp_genbank2gff3.pl. I have a
source GenBank file with multiple records. I would like to have all
of my mRNA features parsed into
mRNA
CDS
exon
structures and the tRNA features parsed into
tRNA
exon
structures in the GFF3 file.
I am calling the script like this:
perl %xampp_root%/perl/bin/bp_genbank2gff3.pl --filter misc_feature
--filter repeat_region --nolump genbank_data/test.small.gb
Everything appears to run OK with no errors, but some exon features
are missing from my output. Most of the mRNAs get parsed as
mRNA/CDS/exon, but some are missing one or more exon features. The
problem seems to get worse the more records there are in the GenBank
source file.
For example, the following portion of the genbank file:
     gene            <5948..>6982
                     /locus_tag="4.t00046"
                     /Name="4.t00046"
     mRNA            5948..6982
                     /db_xref="GI:56474408"
                     /locus_tag="4.t00046"
                     /codon_start=1
                     /protein_id="EAL51779.1"
                     /product="ubiquitin-conjugating enzyme, putative"
     CDS             5948..6982
                     /locus_tag="4.t00046"
gets written as:
AAFB01000019 GenBank gene 5948 6982 . + . ID=4.t00046;locus_tag=4.t00046;Name=4.t00046
AAFB01000019 GenBank mRNA 5948 6982 . + . ID=4.t00046.t01;Parent=4.t00046;db_xref=GI:56474408;locus_tag=4.t00046;codon_start=1;protein_id=EAL51779.1;product=ubiquitin-conjugating enzyme%2C putative
AAFB01000019 GenBank CDS 5948 6982 . + . Parent=4.t00046.t01;locus_tag=4.t00046
No exon line is written for this mRNA, whereas most of the other mRNA
features do get exon features. I notice the same problem with tRNAs
missing exon features.
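To pin down exactly which features are affected, something like the following rough Python sketch could be run over the GFF3 output. It is not part of BioPerl; the function names are made up for illustration, and it assumes the output is standard tab-separated, nine-column GFF3 with ID and Parent attributes (keys are compared case-insensitively).

```python
def parse_attributes(column9):
    """Turn 'ID=x;Parent=y' into {'id': 'x', 'parent': 'y'} (keys lower-cased)."""
    attrs = {}
    for pair in column9.strip().split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            attrs[key.strip().lower()] = value.strip()
    return attrs


def features_missing_exons(gff3_lines, parent_types=("mRNA", "tRNA")):
    """Return IDs of mRNA/tRNA features that no exon names as its Parent."""
    candidates = []        # IDs of mRNA/tRNA features, in file order
    exon_parents = set()   # every ID referenced by an exon's Parent attribute
    for line in gff3_lines:
        if not line.strip() or line.startswith("#"):
            continue       # skip blank lines, directives and comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue       # not a feature line (e.g. embedded FASTA)
        ftype, attrs = cols[2], parse_attributes(cols[8])
        if ftype in parent_types and "id" in attrs:
            candidates.append(attrs["id"])
        elif ftype == "exon" and "parent" in attrs:
            # Parent may hold several comma-separated IDs
            exon_parents.update(attrs["parent"].split(","))
    return [fid for fid in candidates if fid not in exon_parents]
```

Calling `features_missing_exons(open("out.gff3"))` on the converted file (the file name here is only a placeholder) would give the list of mRNA and tRNA IDs that have no exon child, which makes it easier to compare runs on the full file against runs on single records.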
When I parse a single GenBank record on its own, it works fine, so it
seems to be a problem with parsing a single file that contains
multiple GenBank records.
Any idea what's going wrong or what I can do to help troubleshoot?
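Since single records convert correctly, one possible stopgap is to split the multi-record file into one file per record and run the script on each. Below is a minimal Python sketch of the splitting step only. The function name is hypothetical, and it assumes each record ends with a line containing only `//`, as in the standard GenBank flat-file layout.

```python
def split_genbank_records(text):
    """Split GenBank flat-file text into one string per record.

    Assumes every record is terminated by a line consisting of '//'.
    """
    records, current = [], []
    for line in text.splitlines(keepends=True):
        current.append(line)
        if line.strip() == "//":       # end-of-record marker
            records.append("".join(current))
            current = []
    # Keep any trailing text that lacks a closing '//' instead of dropping it
    if "".join(current).strip():
        records.append("".join(current))
    return records
```

Each returned string could then be written to its own file and passed to bp_genbank2gff3.pl separately. This only works around the symptom; it does not explain why the multi-record run drops exons.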
Attached is my source GenBank file.
Thanks a lot!
Eric
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test.small.gb.gz
Type: application/x-gzip
Size: 175413 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070126/c545fa4e/attachment-0004.gz>