[Bioperl-l] Bio::SeqIO::new possible weirdness
Jason Stajich
jason at cgt.duhs.duke.edu
Wed Jan 28 14:51:42 EST 2004
On Wed, 28 Jan 2004, Donald G. Jackson wrote:
> Personally, I like the fall-back but agree that $ARGV[0] shouldn't be it.
> I'd suggest STDIN - if somebody calls new without a file/handle I think
> they're more likely to be reading. OTOH, guessing format would be tough.
The format guessing tries to read off the top of the file, I think - we
support a 'peek' type of reading into the file through the _pushback
functionality in Root::IO.
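To make the peek idea concrete, here is a minimal sketch of guessing a format from the first line and then pushing that line back so the parser still sees the whole stream. It is a toy stand-in, not the actual Root::IO code: the PeekReader class, its next_line/pushback methods, and guess_format are names made up for this example, and the three patterns are only rough first-line checks.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the pushback idea; NOT the Bio::Root::IO code.
package PeekReader;

sub new {
    my ($class, $fh) = @_;
    return bless { fh => $fh, buffer => [] }, $class;
}

# Return the next line, preferring lines that were pushed back earlier.
sub next_line {
    my ($self) = @_;
    return shift @{ $self->{buffer} } if @{ $self->{buffer} };
    my $fh = $self->{fh};
    return scalar <$fh>;
}

# Put a line back so that the next next_line() call returns it again.
sub pushback {
    my ($self, $line) = @_;
    unshift @{ $self->{buffer} }, $line if defined $line;
    return;
}

package main;

# Peek at the first line, push it back, and guess from what was seen.
# Returns undef when the stream is empty or the line is not recognised.
sub guess_format {
    my ($reader) = @_;
    my $line = $reader->next_line;
    return unless defined $line;
    $reader->pushback($line);
    return 'fasta'   if $line =~ /^>/;
    return 'genbank' if $line =~ /^LOCUS\s/;
    return 'embl'    if $line =~ /^ID\s/;
    return;
}

# Usage with an in-memory filehandle standing in for a sequence file.
my $data = ">seq1\nACGT\n";
open my $fh, '<', \$data or die $!;
my $reader = PeekReader->new($fh);
my $format = guess_format($reader);   # peeks at ">seq1"
my $first  = $reader->next_line;      # the peeked line is still available
```

Because the peeked line goes back into the buffer, the same approach should also work on a pipe or STDIN, where seeking back to the start is not possible.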
I would like to see this kind of fall-back go into Root::IO rather than
into SeqIO - and have Root::IO give back a filename if it knows what it is.
The Root::IO code could also do something like this:
$file = "-" unless defined $file;
# 2-arg open on purpose: it treats "-" as STDIN
open my $fh, $file or die $!;
Which will then read from STDIN if no filename is sent in - right now we
don't really support that anymore because it was causing clog-ups in some
of the DB::GFF code/tests I think.
Maybe we localize this to 'FormattedReaderWriters' -- all the
XXXIO(-format => 'XXX') modules so as to avoid the problems Lincoln saw.
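Roughly what I have in mind for the fall-back, as a sketch only: resolve_input is a hypothetical helper name, not anything that exists in Root::IO, and the warning is the one Don asks for below. It assumes the -file / -fh argument style that SeqIO already uses.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of BioPerl: pick an input handle from
# -fh / -file style arguments, falling back to STDIN with a warning
# instead of silently using $ARGV[0].
sub resolve_input {
    my (%args) = @_;

    # An explicitly supplied filehandle wins.
    return $args{-fh} if defined $args{-fh};

    # Otherwise open the named file, or throw if that fails.
    if (defined $args{-file}) {
        open my $fh, '<', $args{-file}
            or die "Cannot open '$args{-file}' for reading: $!\n";
        return $fh;
    }

    # Neither was given: say so, then read from STDIN.
    warn "No -file or -fh given; reading from STDIN\n";
    return \*STDIN;
}

# Usage with an in-memory handle standing in for a real file.
my $text = ">seq1\nACGT\n";
open my $mem, '<', \$text or die $!;
my $in = resolve_input(-fh => $mem);
my $header = <$in>;
```

Swapping the warn for a die would give the stricter behaviour Peter proposes, where leaving out both -file and -fh is an error.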
> At the very least a warning would be appropriate, perhaps indicating the
> course of action.
>
> For xml handlers we can check the dtd and throw an error. I will modify
> my SeqIO::tinyseq::tinyseqHandler to do so.
>
> Don Jackson
>
>
>
> Peter van Heusden wrote:
>
> > My review of the Bio::SeqIO::new method shows the following behaviour:
> >
> > Missing both file and fh arguments: falls back to using $ARGV[0]
> > (the first command line argument) as sequence filename. If this fails,
> > gives an exception about Unknown format.
> > -file argument (without fh argument):
> > · given, but file unreadable: throws exception
> > · undefined: reads $ARGV[0], as above.
> > -fh argument (without file argument):
> > · given, but not a filehandle: gives exception
> > · given, but an invalid filehandle (not open): gives exception
> > · undefined: reads $ARGV[0], as above.
> > -format argument: if the sequence file doesn't correspond to the given
> > format, some parsers give an error (e.g. EMBL), while others do not
> > (GenBank), instead silently giving wrong results.
> > -format argument without file argument: Silently creates a SeqIO
> > object which writes to STDOUT.
> >
> > I don't think that this $ARGV[0] shortcut should be in there - it
> > causes unnecessary potential confusion. Imagine a situation where -fh
> > or -file is specified (using a variable), but that variable somehow
> > does not get defined. In that case, the $ARGV[0] fallback behaviour
> > would be used, which might lead to a non-obvious error behaviour.
> >
> > I'd like to propose that either -file or -fh should be specified,
> > otherwise an exception is thrown. While I'm about it, I'm thinking of
> > migrating the exceptions to the new 'typed exceptions' that BioPerl
> > now provides - is there any consensus on exception type names?
> >
> > Peter
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at portal.open-bio.org
> > http://portal.open-bio.org/mailman/listinfo/bioperl-l
> >
>
>
--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu