[Bioperl-l] Bio::Restriction::IO issues

Chris Fields cjfields at uiuc.edu
Tue May 30 14:50:06 UTC 2006


Jason, Brian, et al,

I found several major issues with Bio::Restriction::IO (this popped up while
bug squashing).  In particular, the POD is pretty misleading.  It states
(directly from perldoc):

SYNOPSIS
        use Bio::Restriction::IO;

        $in  = Bio::Restriction::IO->new(-file => "inputfilename" ,
                                         -format => 'withrefm');
        $out = Bio::Restriction::IO->new(-file => ">outputfilename" ,
                                         -format => 'bairoch');
        my $res = $in->read; # a Bio::Restriction::EnzymeCollection
        $out->write($res);

      # or

      #    use Bio::Restriction::IO;
      #
      #    #input file format can be read from the file extension (dat|xml)
      #    $in  = Bio::Restriction::IO->newFh(-file => "inputfilename");
      #    $out = Bio::Restriction::IO->newFh('-format' => 'xml');
      #
      #    # World's shortest flat<->xml format converter:
      #    print $out $_ while <$in>;

So, I have found several problems with these modules.  I really hate to
criticize code here, as my own is pretty hacky, but I think these are things
to seriously mull over: 

1)	Note that, although some of the lines above are commented out, they
are still in the POD and thus show up in perldoc/pod2html etc.  Judging from
the above, the synopsis suggests that the script should read in one format
and write out another (like SeqIO).  However, NONE of the write() methods
are currently implemented for any of the IO modules (withrefm, base, itype2,
bairoch), so this does not happen as expected.  Instead you get the nasty
thrown 'method not implemented' error when writing.
2)	The commented statements in the POD above also suggest that REBASE
XML format is supported, even though there is no XML module.  
3)	The Bio::Restriction::IO::bairoch module had multiple bugs which
made it unusable until I added a few small changes; it still can't handle
multisite/multicut enzymes properly, so in essence it is useless until that
is addressed.
4)	Bio::Restriction::IO inherits from Bio::SeqIO, though I'm not sure
why.  Shouldn't it just inherit from Bio::Root::Root/Bio::Root::IO and make
up its own methods?  
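
To make point 1 concrete, here is a minimal, self-contained sketch of the
pattern as I understand it: a base class whose read()/write() are stubs that
die, a format driver that only overrides read(), and a caller that guards
write() with eval.  All package and sub names below (My::Restriction::IO::Base,
My::Restriction::IO::ReadOnlyFormat, try_write) are hypothetical stand-ins for
illustration, not the actual BioPerl classes, and the enzyme list is
placeholder data.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical stand-in for a format-dispatching IO base class
# (NOT the real Bio::Restriction::IO).
package My::Restriction::IO::Base;

sub new {
    my ($class, %args) = @_;
    return bless { %args }, $class;
}

# Stub methods: format drivers are expected to override these.
sub read  { my $self = shift; die ref($self) . ": read() not implemented\n" }
sub write { my $self = shift; die ref($self) . ": write() not implemented\n" }

# Hypothetical format driver that implements read() but not write().
package My::Restriction::IO::ReadOnlyFormat;
our @ISA = ('My::Restriction::IO::Base');

sub read {
    my $self = shift;
    return [ 'EcoRI', 'BamHI' ];    # placeholder data, not a real collection
}

package main;

# Returns 1 if write() succeeded, 0 if the driver died (e.g. not implemented).
sub try_write {
    my ($out, $collection) = @_;
    my $ok = eval { $out->write($collection); 1 };
    return $ok ? 1 : 0;
}

my $io      = My::Restriction::IO::ReadOnlyFormat->new(-format => 'readonly');
my $enzymes = $io->read;
my $wrote   = try_write($io, $enzymes);

print "read ", scalar(@$enzymes), " enzymes; write ",
    ($wrote ? "succeeded" : "is not implemented"), "\n";
```

With a guard like this a script can still fall back to reading only, rather
than dying halfway through a format conversion.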

I'm working on at least getting the 'bairoch' input format up and running
(so that it at least gets the enzymes into a
Bio::Restriction::EnzymeCollection).  From this point I'm not sure where
to proceed.  The POD obviously needs to be corrected to reflect that writing
formats is not implemented (and the bit about XML should be taken out
completely); that's the easy part, which I am working on and plan to commit
today.  However, these modules don't seem to be used too frequently so I'm
not sure whether it's worth spending too much time getting these up to speed
at the moment (adding write methods, switching to Bio::Root::Root, etc); I
have other priorities at the moment (including a way overdue ListSummary).
I'm also not sure who else is (using|working) on these so I don't want to
(make too many changes|step on someone else's toes), but these are, IMHO,
pretty serious problems.  

Any thoughts?

Chris


Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 





