[Bioperl-l] Bio::SeqIO::new possible weirdness

Peter van Heusden pvh at egenetics.com
Wed Jan 28 15:45:29 EST 2004

Jason Stajich wrote:

>On Wed, 28 Jan 2004, Donald G. Jackson wrote:
>>Personally, I like the fall-back but agree that $ARGV[0] shouldn't be it.
>>I'd suggest STDIN - if somebody calls new without a file/handle I think
>>they're more likely to be reading.  OTOH, guessing format would be tough.
>the guess format is trying to read off the top of the file I think - we
>support a 'peek' type of reading into the file, by having the _pushback
>functionality in Root::IO.
>I would like to see something like this go into Root::IO rather than in
>SeqIO - and have Root::IO give back a filename if it knows what it is.
>The Root::IO code could also do something like this:
> $file = "-" unless defined $file;
> open my $fh => $file or die $!;
>Which will then read from stdin if no filename is sent in - right now we
>don't really support that anymore because it was causing clog-ups in some
>of the DB::GFF code/tests I think.
>Maybe we localize this to 'FormattedReaderWriters' -- all the
>XXXIO(-format => 'XXX') modules so as to avoid the problems Lincoln saw.
Can you point me to where Lincoln "saw" this problem? The BioPerl mailing list 
archive is not searchable, and searching via Google doesn't turn 
anything up.

Anyway, I'll look into Root::IO tomorrow and see what I come up with.
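
A minimal sketch of the STDIN fallback and the 'peek' idea from the quoted
text above, in plain Perl. It is not the Root::IO implementation: the names
open_input, make_reader and guess_format are hypothetical, and the first-line
patterns are simplified assumptions made only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Bio::Root::IO: open the named file,
# or fall back to STDIN when no filename (or "-") is given.
sub open_input {
    my ($file) = @_;
    return \*STDIN if !defined $file || $file eq '-';
    open my $fh, '<', $file or die "Cannot open '$file': $!";
    return $fh;
}

# Minimal pushback reader: lines that were read only to "peek" are kept
# in a buffer and handed out again by the next read.
sub make_reader {
    my ($fh) = @_;
    my @pushback;
    return {
        readline => sub { @pushback ? shift @pushback : scalar <$fh> },
        pushback => sub { unshift @pushback, @_ },
    };
}

# Illustrative guess based on the first line only (assumed patterns);
# the line is pushed back so the parser still sees the whole input.
sub guess_format {
    my ($reader) = @_;
    my $line = $reader->{readline}->();
    return unless defined $line;
    $reader->{pushback}->($line);
    return 'fasta'   if $line =~ /^>/;
    return 'genbank' if $line =~ /^LOCUS\s/;
    return 'embl'    if $line =~ /^ID\s/;
    return;
}
```

Because the peeked line goes back into the buffer, this kind of guessing
also works on a stream such as STDIN, which cannot be rewound.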


>>At the very least a warning would be appropriate, perhaps indicating the
>>course of action.
>>For xml handlers we can check the dtd and throw an error.  I will modify
>>my SeqIO::tinyseq::tinyseqHandler to do so.
>>Don Jackson
>>Peter van Heusden wrote:
>>>My review of the Bio::SeqIO::new method shows the following behaviour:
>>>Missing both -file and -fh arguments: falls back to using $ARGV[0]
>>>(the first command line argument) as sequence filename. If this fails,
>>>gives an exception about 'Unknown format'.
>>>-file argument (without -fh argument):
>>>* given, but file unreadable: throws exception
>>>* undefined: reads $ARGV[0], as above.
>>>-fh argument (without -file argument):
>>>* given, but not a filehandle: gives exception
>>>* given, but an invalid filehandle (not open): gives exception
>>>* undefined: reads $ARGV[0], as above.
>>>-format argument: if the sequence file doesn't correspond to the given
>>>format, some parsers give an error (e.g. EMBL), while others do not
>>>(GenBank) and instead silently give wrong results.
>>>-format argument without -file argument: silently creates a SeqIO
>>>object which writes to STDOUT.
>>>I don't think that this $ARGV[0] shortcut should be in there - it
>>>causes unnecessary potential confusion. Imagine a situation where -fh
>>>or -file is specified (using a variable), but that variable somehow
>>>does not get defined. In that case, the $ARGV[0] fallback behaviour
>>>would be used, which might lead to a non-obvious error behaviour.
>>>I'd like to propose that either -file or -fh should be specified,
>>>otherwise an exception is thrown. While I'm about it, I'm thinking of
>>>migrating the exceptions to the new 'typed exceptions' that BioPerl
>>>now provides - is there any consensus on exception type names?
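
A minimal sketch of the check proposed above, under stated assumptions: it is
not the actual Bio::SeqIO::new code, check_io_args is a hypothetical name, and
plain die stands in for whatever typed exception classes are agreed on.

```perl
use strict;
use warnings;

# Hypothetical validator, not the actual Bio::SeqIO::new code.
# Requires -file or -fh, and treats "passed but undefined" as an error
# instead of silently falling back to $ARGV[0].
sub check_io_args {
    my (%args) = @_;
    for my $key (qw(-file -fh)) {
        die "$key was passed but its value is undefined\n"
            if exists $args{$key} && !defined $args{$key};
    }
    die "Either -file or -fh must be specified\n"
        unless defined $args{-file} || defined $args{-fh};
    return 1;
}
```

Checking exists separately from defined is what catches the case described
above, where a variable passed as -file or -fh never got a value.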
>>>Bioperl-l mailing list
>>>Bioperl-l at portal.open-bio.org
>Jason Stajich
>Duke University
>jason at cgt.mc.duke.edu
