[Bioperl-l] using Bio::SeqIO to convert from table to genbank format ..... attribute_map example

Cook, Malcolm MEC at stowers.org
Mon Sep 14 14:44:28 UTC 2015


Hi Chris, Brian, Hilmar, et al.,

Thanks for offering to consider this change.

Attached is a test.tab and converted test.tab.gb

test.tab has three columns: n (display_id), d (definition/description), and s (sequence).

test.tab.gb contains what I would hope to get when writing in genbank format after reading with:

	Bio::SeqIO->new(-file => $filename, -format => 'table', -header => 1, -display_id => 1, -accession_number => 1, -seq => 3, -desc => 2)
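
Since -desc is not currently accepted by ->new, here is a minimal sketch of the same conversion done through attribute_map (the call quoted further down in this thread). It assumes, without having been verified, that a single attribute_map call carrying all four keys keeps -display_id and -accession_number from being lost. The helper name full_attribute_map is made up for illustration; the column numbers are those of test.tab.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: the complete attribute-to-column map for test.tab.
# Columns are 1-based: n (display_id), d (description), s (sequence).
# ASSUMPTION (untested): passing every key in one attribute_map call
# avoids dropping -display_id and -accession_number.
sub full_attribute_map {
    return {
        -display_id       => 1,
        -accession_number => 1,
        -description      => 2,
        -seq              => 3,
    };
}

# Run the conversion only when a table file is named on the command line.
if (@ARGV) {
    require Bio::SeqIO;    # loaded at run time, only when actually converting

    my $in = Bio::SeqIO->new(-file => $ARGV[0], -format => 'table', -header => 1);
    $in->attribute_map(full_attribute_map());

    my $out = Bio::SeqIO->new(-fh => \*STDOUT, -format => 'genbank');
    while (my $seq = $in->next_seq) {
        $out->write_seq($seq);
    }
}
```

Saved under any name, the script would be run with the table file as its only argument, with STDOUT redirected to the .gb file.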


You may be additionally interested in the following:  
After preparing this data, I tried to round-trip it, and found the following error when trying to convert test.tab.gb back to table format:

perl -M'Bio::SeqIO'  -e '$out = Bio::SeqIO->new(-format => qq{table}); $in = Bio::SeqIO->new(-format => qq{genbank},-file=>"test.tab.gb");  while ( my $seq = $in->next_seq() ) {$out->write_seq($seq) }'  > test.tab.gb.tab

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: Sorry, you cannot write to a generic Bio::SeqIO object.
STACK: Error::throw
STACK: Bio::Root::Root::throw /n/local/stage/perlbrew/perlbrew-0.43/perls/perl-5.16.1t/lib/site_perl/5.16.1/Bio/Root/Root.pm:486
STACK: Bio::SeqIO::write_seq /n/local/stage/perlbrew/perlbrew-0.43/perls/perl-5.16.1t/lib/site_perl/5.16.1/Bio/SeqIO.pm:540
STACK: -e:1
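
The stack trace shows write_seq resolving to the generic Bio::SeqIO class (SeqIO.pm:540), which suggests the table module does not supply a writer of its own. Under that reading, the round trip can be sketched by printing the rows by hand. The helper name seq_to_row and the n/d/s header line are assumptions taken from test.tab, not part of BioPerl.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: format one sequence object as a tab-delimited row
# (display_id, description, sequence). Undefined fields become empty strings.
sub seq_to_row {
    my ($seq) = @_;
    my @fields = ($seq->display_id, $seq->desc, $seq->seq);
    return join("\t", map { defined $_ ? $_ : '' } @fields);
}

# Convert only when a GenBank file is named on the command line.
if (@ARGV) {
    require Bio::SeqIO;    # loaded at run time, only when actually converting

    my $in = Bio::SeqIO->new(-file => $ARGV[0], -format => 'genbank');
    print join("\t", qw(n d s)), "\n";    # header line as in test.tab
    while (my $seq = $in->next_seq) {
        print seq_to_row($seq), "\n";
    }
}
```

This only uses the display_id, desc and seq accessors of the sequence object, so it sidesteps the table writer entirely.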

Any help is much appreciated.  I do have a workaround for now, but it is a kludge.

Cheers,

Malcolm

 > -----Original Message-----
 > From: Fields, Christopher J [mailto:cjfields at illinois.edu]
 > Sent: Saturday, September 12, 2015 11:12 PM
 > To: Cook, Malcolm <MEC at stowers.org>
 > Cc: bioperl-l at mailman.open-bio.org; Hilmar Lapp <hlapp at gmx.net>
 > Subject: Re: [Bioperl-l] using Bio::SeqIO to convert from table to genbank
 > format ..... attribute_map example
 > 
 > Hi Malcolm,
 > 
 > Best thing would be to have a dummy example for expected input and output
 > so it can be tested against, just to make sure things work as expected.  Could
 > you supply that?  Certainly seems like it should be feasible.
 > 
 > chris
 > 
 > > On Sep 12, 2015, at 12:16 AM, Cook, Malcolm <MEC at stowers.org> wrote:
 > >
 > > Fellow long-time BioPerlers,
 > >
 > > I am using Bio::SeqIO with success to convert between table (cf.
 > > http://search.cpan.org/~cjfields/BioPerl/Bio/SeqIO/table.pm) and genbank
 > > flatfile format.
 > >
 > > I have Bio::SeqIO sequence format conversion wrapped in a command-line
 > > script.  The script exposes to the command line the parameters to ->new for
 > > both input and output objects through judicious use of GetOptions.  I have used
 > > this script in many conversion tasks between many different formats.
 > >
 > > ... except now ...
 > >
 > > I am having trouble with reading the flatfile format.
 > >
 > > Happily, at first, I see that -display_id and -accession_number are both
 > > parameters to Bio::SeqIO::table->new.  So they are naturally exposed to the
 > > command line as "in format=table header=1 display_id=1 seq=3".
 > >
 > > Alas however -description is not a parameter to ->new.
 > >
 > > The only way I can see to configure table.pm to take the sequence
 > > description (aka desc) from the 2nd column of my .tab file is as follows:
 > >
 > > 	$in->attribute_map({-description => 2});
 > >
 > > ... however, my trace shows me that even though this does work to set the
 > > desc attribute of the wrapped Bio::PrimarySeq to the value from column 2,
 > > unfortunately using the attribute_map also removes the individual values
 > > passed in for -display_id and -accession_number.
 > >
 > > Ideally (I think) Bio::SeqIO::table->new would take a -description=2 instead
 > > of having to call attribute_map.
 > >
 > > Or, Bio::SeqIO::table->new would take -attribute_map and even accept it as
 > > a string which gets evaluated to a hash reference, just as I see -colnames can
 > > be passed as a string evaling to an array (which I see in the unit test:
 > > http://cpansearch.perl.org/src/CJFIELDS/BioPerl-1.6.924/t/SeqIO/table.t).  This
 > > would allow the hash to be supplied at the command line.
 > >
 > > Or, am I missing something?
 > >
 > > FWIW: I am trying to help a lab convert a few years of plasmids from
 > > DNAPlasmid to Genbank (for load into Vector NTI) and I am passing through
 > > Bio::SeqIO::table in so doing.
 > >
 > > Cheers, and Thanks for help and suggestions....
 > >
 > > Malcolm Cook
 > > Stowers Institute for Medical Research
 > > 1000 E 50th Street
 > > Kansas City, MO 64110
 > > (816) 926-4449
 > > mec at stowers.org
 > >
 > >
 > > _______________________________________________
 > > Bioperl-l mailing list
 > > Bioperl-l at mailman.open-bio.org
 > > http://mailman.open-bio.org/mailman/listinfo/bioperl-l
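
The idea in the quoted message of accepting -attribute_map as a string that gets evaluated to a hash reference could be sketched roughly as follows. This is plain Perl and not part of BioPerl; the function name parse_map_option is hypothetical. The leading "+" makes Perl read the braces as an anonymous hash rather than a code block. String eval runs arbitrary code, so this is only suitable for trusted command-line input.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a command-line string such as
# '{-description => 2, -display_id => 1}' into a hash reference.
sub parse_map_option {
    my ($text) = @_;

    # The unary plus forces "{...}" to be parsed as an anonymous hash.
    my $map = eval "+$text";
    die "could not evaluate attribute map '$text': $@" if $@;
    die "attribute map must evaluate to a hash reference\n"
        unless ref($map) eq 'HASH';
    return $map;
}
```

The result could then be handed to attribute_map on the input object, for example $in->attribute_map(parse_map_option($string)).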

-------------- next part --------------
A non-text attachment was scrubbed...
Name: test.tab.gb
Type: application/octet-stream
Size: 370 bytes
Desc: test.tab.gb
URL: <http://mailman.open-bio.org/pipermail/bioperl-l/attachments/20150914/a0cf8ef7/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test.tab
Type: application/octet-stream
Size: 28 bytes
Desc: test.tab
URL: <http://mailman.open-bio.org/pipermail/bioperl-l/attachments/20150914/a0cf8ef7/attachment-0001.obj>

