[Bioperl-l] Bio::SeqIO::genbank seems to munge SOURCE and ORGANISM lines
Geoff Purdy
geoff_purdy at yahoo.com
Thu Feb 20 12:53:31 EST 2003
I've noticed some more strange behavior with reading
and printing GenBank files using Bio::SeqIO.
It appears that the SOURCE and ORGANISM lines are
being corrupted by BioPerl in some records. Below is
an example using GenBank accession AE016800 (for ease
of reading, I've removed the unaffected surrounding
lines):
From the original GenBank file:
SOURCE Vibrio vulnificus CMCP6
ORGANISM Vibrio vulnificus CMCP6
From the output after reading in and printing out with
Bio::SeqIO:
SOURCE Vibrio vulnificus CMCP6 CMCP6.
ORGANISM Vibrio vulnificus
You can see that CMCP6 was dropped from the ORGANISM
and appended (along with a mystery period) to the
SOURCE.
Is this a known issue, or should I submit it to
bugzilla?
Here is the code snippet that I used for reading the
file and writing it back out:
use Bio::SeqIO;
# Round trip: read each record from the file and print it back out.
my $in  = Bio::SeqIO->newFh(-file   => "AE016800.gb",
                            -format => 'Genbank');
my $out = Bio::SeqIO->newFh(-format => 'Genbank');
print $out $_ while <$in>;
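
For anyone who wants to check other records for the same problem, here is a rough sketch of how the SOURCE and ORGANISM lines from the original file and from the round-tripped output could be compared. It is plain Perl and does not use BioPerl at all. The helper name source_lines is hypothetical, the two strings are just the fragments quoted above, and only the first line of each field is looked at (an assumption that is enough for this example).

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): pull the first SOURCE and
# ORGANISM lines out of GenBank flat-file text, keyed by tag name.
sub source_lines {
    my ($text) = @_;
    my %found;
    for my $line (split /\n/, $text) {
        if ($line =~ /^\s*(SOURCE|ORGANISM)\s+(.*?)\s*$/) {
            # Keep only the first occurrence of each tag.
            $found{$1} = $2 unless exists $found{$1};
        }
    }
    return \%found;
}

# The two fragments quoted in this message.
my $original   = "SOURCE Vibrio vulnificus CMCP6\n"
               . "ORGANISM Vibrio vulnificus CMCP6\n";
my $round_trip = "SOURCE Vibrio vulnificus CMCP6 CMCP6.\n"
               . "ORGANISM Vibrio vulnificus\n";

my $before = source_lines($original);
my $after  = source_lines($round_trip);

# Report every tag whose value differs between the two versions.
for my $tag (qw(SOURCE ORGANISM)) {
    next if $before->{$tag} eq $after->{$tag};
    print "$tag changed: '$before->{$tag}' -> '$after->{$tag}'\n";
}
```

To run it against real files, the two strings would be replaced with the contents of the original file and of the captured Bio::SeqIO output.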