[Bioperl-l] Bio::SeqIO::new possible wierdness

Jason Stajich jason at cgt.duhs.duke.edu
Fri Jan 30 09:52:09 EST 2004


I don't know then - Chris has graciously set it up, so either ht://dig is not
doing its job very well or there is something misconfigured. We have tried
to make the lists searchable at http://search.open-bio.org/ and if it isn't
working properly that is another issue.  A Google search for
site:open-bio.org pipermail bioperl-l your-term
also works pretty well.

It really is a major job making sure all of the website/cvs/server
components work correctly all the time.  I wish there was a way to give
Chris more of a hand on these things, as he has a full-time consulting gig
to keep him around in the first place.

--jason

On Wed, 28 Jan 2004, Brian Osborne wrote:

> Jason,
>
> I'm a bit suspicious of search.open-bio.org. I enter a term like 'Root' or
> 'GFF' and get back a dozen hits or so. It's inconceivable to me that there's
> only 12 messages in bioperl-l since 1999 containing the string 'GFF'.
> Something's wrong, either with the search or the display. And if there are
> no matches I see only a blank page, which is a bit inscrutable. Then if I
> select 'no restriction', which I guess means everything in the selectable
> list I don't see the Bioperl matches anymore, I just see a dozen or so
> Biojava matches.
>
> Brian O.
>
> -----Original Message-----
> From: bioperl-l-bounces at portal.open-bio.org
> [mailto:bioperl-l-bounces at portal.open-bio.org]On Behalf Of Jason Stajich
> Sent: Wednesday, January 28, 2004 4:34 PM
> To: Peter van Heusden
> Cc: bioperl-l at bioperl.org
> Subject: Re: [Bioperl-l] Bio::SeqIO::new possible wierdness
>
> The bioperl list is searchable - just not the bioperl-guts though -
> http://search.open-bio.org
> and/or google works fine for me
>
>
> This is the change Lincoln made though (I ran cvs log on Bio/Root/IO.pm
> and found the last commit by Lincoln).  I had put the \*ARGV in there so
> that we could use the magic <> operator (allows STDIN or a list of files
> to all be used as transparent input).  This caused some problems with
> tests in GFF, SeqFeature, or Registry.
>
> Here is his log message
> revision 1.50
> date: 2003/11/21 03:03:38;  author: lstein;  state: Exp;  lines: +2 -2
> The following regression tests now pass: GFF, SeqFeature, Registry
>
> --jason
>
> jason at jason $ cvs diff -r 1.49 Bio/Root/IO.pm
> Index: Bio/Root/IO.pm
> ===================================================================
> RCS file: /home/repository/bioperl/bioperl-live/Bio/Root/IO.pm,v
> retrieving revision 1.49
> diff -r1.49 IO.pm
> 1c1
> < # $Id: IO.pm,v 1.49 2003/10/28 21:58:54 jason Exp $
> ---
> > # $Id: IO.pm,v 1.50 2003/11/21 03:03:38 lstein Exp $
> 435c435
> <     my $fh = $self->_fh || \*ARGV;
> ---
> >     my $fh = $self->_fh or return;
>
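To make the one-line diff above concrete, here is a minimal sketch of the two behaviours side by side. The package Sketch::Reader and both method names are made up for illustration; this is not the Bio::Root::IO code, only the same two expressions wrapped in a toy class.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a reader object; not the Bio::Root::IO code.
package Sketch::Reader;

sub new {
    my ($class, %args) = @_;
    return bless { fh => $args{fh} }, $class;
}

# Revision 1.49 style: with no handle set, fall back to the ARGV handle,
# so reads come from the files named on the command line or from STDIN.
sub readline_with_fallback {
    my ($self) = @_;
    my $fh = $self->{fh} || \*ARGV;
    return scalar <$fh>;
}

# Revision 1.50 style: with no handle set, return without reading anything.
sub readline_strict {
    my ($self) = @_;
    my $fh = $self->{fh} or return;
    return scalar <$fh>;
}

package main;

# An in-memory filehandle keeps the demo self-contained.
my $data = "first line\nsecond line\n";
open my $in, '<', \$data or die $!;

my $with_fh = Sketch::Reader->new(fh => $in);
my $without = Sketch::Reader->new;

my $line = $with_fh->readline_strict;
print "with a handle: $line";

my $nothing = $without->readline_strict;
print defined($nothing) ? "got data\n" : "no handle: nothing read\n";
```

Calling readline_with_fallback on the object without a handle would wait on STDIN when no command line arguments are given, which is why the demo does not do it.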
>
> On Wed, 28 Jan 2004, Peter van Heusden wrote:
>
> > Jason Stajich wrote:
> >
> > >On Wed, 28 Jan 2004, Donald G. Jackson wrote:
> > >
> > >
> > >
> > >>Personally, I like the fall-back but agree that $ARGV[0] shouldn't be it.
> > >>I'd suggest STDIN - if somebody calls new without a file/handle I think
> > >>they're more likely to be reading.  OTOH, guessing format would be tough.
> > >>
> > >>
> > >
> > >the guess format is trying to read off the top of the file I think - we
> > >support a 'peek' type of reading into the file, by having the _pushback
> > >functionality in Root::IO.
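A rough sketch of the peek idea, assuming nothing about how Root::IO actually implements it: read one line, look at it, and put it back in a buffer so the real parser still sees it. Sketch::PeekReader, read_line, push_back and guess_format are invented names for this example, and the three patterns are only a crude first-line heuristic.

```perl
use strict;
use warnings;

# Hypothetical minimal reader with a pushback buffer; illustrative only,
# not the Bio::Root::IO implementation.
package Sketch::PeekReader;

sub new {
    my ($class, $fh) = @_;
    return bless { fh => $fh, buffer => [] }, $class;
}

# Serve pushed-back lines first, then read from the underlying handle.
sub read_line {
    my ($self) = @_;
    return shift @{ $self->{buffer} } if @{ $self->{buffer} };
    my $fh = $self->{fh};
    return scalar <$fh>;
}

# Put a line back so the next read_line call returns it again.
sub push_back {
    my ($self, $line) = @_;
    unshift @{ $self->{buffer} }, $line if defined $line;
    return;
}

# Guess the format from the first line without consuming it.
sub guess_format {
    my ($self) = @_;
    my $line = $self->read_line;
    $self->push_back($line);
    return 'unknown' unless defined $line;
    return 'fasta'   if $line =~ /^>/;
    return 'genbank' if $line =~ /^LOCUS\s/;
    return 'embl'    if $line =~ /^ID\s/;
    return 'unknown';
}

package main;

my $data = ">seq1 example\nACGTACGT\n";
open my $in, '<', \$data or die $!;

my $reader = Sketch::PeekReader->new($in);
print "format: ", $reader->guess_format, "\n";
print "first line still available: ", $reader->read_line;
```

Because the buffer sits in front of the handle, this works the same on STDIN or a pipe, where seeking back to the start is not possible.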
> > >
> > >I would like to see something like this go into Root::IO rather than in
> > >SeqIO - and have Root::IO give back a filename if it knows what it is.
> > >
> > >The Root::IO code could also do something like this:
> > > $file = "-" unless defined $file;
> > > open my $fh => $file or die $!;
> > >
> > >Which will then read from stdin if no filename is sent in - right now we
> > >don't really support that anymore because it was causing clog-ups in some
> > >of the DB::GFF code/tests I think.
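Spelled out a little more, the stdin fallback could look like the sketch below. open_input is a hypothetical helper, not anything in Root::IO, and it differs from the two-liner above in one respect: it uses three-argument open for named files and keeps the two-argument form only for "-", which perlfunc documents as opening STDIN.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper (not part of Bio::Root::IO): open a named file,
# or STDIN when no name is given.
sub open_input {
    my ($file) = @_;
    my $fh;
    if (defined $file) {
        # Three-argument open: the name is never parsed as a mode or a pipe.
        open $fh, '<', $file or die "Could not open '$file': $!";
    }
    else {
        # Two-argument open of '-' opens STDIN.
        open $fh, '-' or die "Could not open STDIN: $!";
    }
    return $fh;
}

# Demo with a temporary file so the script does not wait on STDIN.
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
print {$tmp_fh} ">seq1\nACGT\n";
close $tmp_fh or die $!;

my $fh = open_input($tmp_name);
my $first = <$fh>;
print "first line: $first";
```

Calling open_input() with no argument would read whatever is piped in, e.g. cat seqs.fa | perl script.pl.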
> > >
> > >Maybe we localize this to 'FormattedReaderWriters' -- all the
> > >XXXIO(-format => 'XXX') modules so as to avoid the problems Lincoln saw.
> > >
> > >
> > >
> > >
> > Can you point to where Lincoln "saw" this problem? The BioPerl mailing list
> > archive is not searchable, and searching via Google doesn't turn
> > anything up.
> >
> > Anyway, I'll look into Root::IO tomorrow and see what I come up with.
> >
> > Peter
> >
> > >
> > >
> > >>At the very least a warning would be appropriate, perhaps indicating the
> > >>course of action.
> > >>
> > >>For xml handlers we can check the dtd and throw an error.  I will modify
> > >>my SeqIO::tinyseq::tinyseqHandler to do so.
> > >>
> > >>Don Jackson
> > >>
> > >>
> > >>
> > >>Peter van Heusden wrote:
> > >>
> > >>
> > >>
> > >>>My review of the Bio::SeqIO::new method shows the following behaviour:
> > >>>
> > >>>Missing both -file and -fh arguments: falls back to using $ARGV[0]
> > >>>(the first command line argument) as sequence filename. If this fails,
> > >>>gives an exception about 'Unknown format'.
> > >>>-file argument (without -fh argument):
> > >>>* given, but file unreadable: throws exception
> > >>>* undefined: reads $ARGV[0], as above.
> > >>>-fh argument (without -file argument):
> > >>>* given, but not a filehandle: gives exception
> > >>>* given, but an invalid filehandle (not open): gives exception
> > >>>* undefined: reads $ARGV[0], as above.
> > >>>-format argument: if the sequence file doesn't correspond to the given
> > >>>format, some parsers give an error (e.g. EMBL), while others do not
> > >>>(GenBank), instead silently give wrong results.
> > >>>-format argument without -file argument: Silently creates a SeqIO
> > >>>object which writes to STDOUT.
> > >>>
> > >>>I don't think that this $ARGV[0] shortcut should be in there - it
> > >>>causes unnecessary potential confusion. Imagine a situation where -fh
> > >>>or -file is specified (using a variable), but that variable somehow
> > >>>does not get defined. In that case, the $ARGV[0] fallback behaviour
> > >>>would be used, which might lead to a non-obvious error behaviour.
> > >>>
> > >>>I'd like to propose that either -file or -fh should be specified,
> > >>>otherwise an exception is thrown. While I'm about it, I'm thinking of
> > >>>migrating the exceptions to the new 'typed exceptions' that BioPerl
> > >>>now provides - is there any consensus on exception type names?
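The proposal above amounts to a check like the following. This is only a sketch: check_input_args is a made-up name, and a plain die stands in for whatever exception class ends up being used.

```perl
use strict;
use warnings;

# Hypothetical argument check; the name and the return shape are
# illustrative, not the Bio::SeqIO code.
sub check_input_args {
    my (%args) = @_;
    my ($file, $fh) = @args{qw(-file -fh)};

    # An undefined value counts as "not specified", so a variable that
    # never got set is caught here instead of triggering a fallback.
    die "Either -file or -fh must be specified\n"
        unless defined($file) || defined($fh);

    return defined($file) ? (file => $file) : (fh => $fh);
}

# Simulate the case described above: the caller passes a variable
# that somehow never got defined.
my $missing;
my $ok = eval { check_input_args(-file => $missing); 1 };
print $ok ? "accepted\n" : "rejected: $@";
```

With the check in place the failure shows up at the call to new, not later as an 'Unknown format' error about an unrelated command line argument.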
> > >>>
> > >>>Peter
> > >>>_______________________________________________
> > >>>Bioperl-l mailing list
> > >>>Bioperl-l at portal.open-bio.org
> > >>>http://portal.open-bio.org/mailman/listinfo/bioperl-l
> > >>>
> > >>>
> > >>>
> > >>
> > >>
> > >>
> > >
> > >--
> > >Jason Stajich
> > >Duke University
> > >jason at cgt.mc.duke.edu
> > >
> > >
> > >
> >
> >
>
> --
> Jason Stajich
> Duke University
> jason at cgt.mc.duke.edu
>
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu

