[Bioperl-l] using Bio::SeqIO to convert from table to genbank format ..... attribute_map example

Fields, Christopher J cjfields at illinois.edu
Sun Sep 13 04:11:51 UTC 2015


Hi Malcolm,

The best thing would be a dummy example of the expected input and output that we can test against, just to make sure things work as expected. Could you supply that? It certainly seems like it should be feasible.

chris

> On Sep 12, 2015, at 12:16 AM, Cook, Malcolm <MEC at stowers.org> wrote:
> 
> Fellow long-time BioPerlers,
> 
> I am successfully using Bio::SeqIO to convert between table (cf. http://search.cpan.org/~cjfields/BioPerl/Bio/SeqIO/table.pm) and genbank flatfile format.
> 
> I have Bio::SeqIO sequence format conversion wrapped in a command-line script.  The script exposes to the command line the parameters to ->new for both input and output objects through judicious use of GetOptions.  I have used this script in many conversion tasks between many different formats.
> 
> ... except now ...
> 
> I am having trouble with reading the flatfile format.
> 
> Happily, at first, I see that -display_id and -accession_number are both parameters to Bio::SeqIO::table->new, so they are naturally exposed to the command line as "in format=table header=1 display_id=1 seq=3".
> 
> Alas, however, -description is not a parameter to ->new.
> 
> The only way I can see to configure table.pm to take the sequence description (aka desc) from the 2nd column of my .tab file is as follows:
> 
> 	$in->attribute_map({-description => 2});
> 
> ... however, my trace shows that although this does set the desc attribute of the wrapped Bio::PrimarySeq to the value from column 2, calling attribute_map also removes the individual values passed in for -display_id and -accession_number.
> 
> Ideally (I think) Bio::SeqIO::table->new would take a -description=2 parameter, so that calling attribute_map would not be necessary.
> 
> Or, Bio::SeqIO::table->new would take -attribute_map and even accept it as a string that gets evaluated to a hash reference, just as -colnames can be passed as a string that evaluates to an array (which I see in the unit test: http://cpansearch.perl.org/src/CJFIELDS/BioPerl-1.6.924/t/SeqIO/table.t). This would allow the hash to be supplied at the command line.
> 
> Or, am I missing something?
> 
> FWIW: I am trying to help a lab convert a few years of plasmids from DNAPlasmid to Genbank (for loading into Vector NTI), and I am passing through Bio::SeqIO::table in so doing.
> 
> Cheers, and Thanks for help and suggestions....
> 
> Malcolm Cook
> Stowers Institute for Medical Research
> 1000 E 50th Street
> Kansas City, MO 64110
> (816) 926-4449
> mec at stowers.org
> 
> 



