[BioPerl] Re: [Bioperl-l] Bemusement with get_seq_by_gi in a
CGI script
Jason Stajich
jason at cgt.duhs.duke.edu
Fri Sep 12 10:43:57 EDT 2003
You might try setting -verbose => -1 in the code that uses
Bio::DB::GenBank; that suppresses BioPerl's warnings.
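
A minimal sketch of what that could look like, combined with the
-retrievaltype option discussed further down the thread. The helper names
genbank_options and fetch_by_gi are made up for illustration, and I have
not checked whether this also silences the RefSeq redirect output, so
treat it as something to try rather than a confirmed fix. The fetch runs
before the HTTP header is printed, and failures go to STDERR (the error
log) rather than STDOUT.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Constructor options mentioned in this thread (hypothetical helper name).
sub genbank_options {
    return (
        -retrievaltype => 'io_string',  # hold the record in memory, no fork-and-pipe
        -verbose       => -1,           # suppress BioPerl warnings
    );
}

# Hypothetical helper: fetch one record by GI; returns undef on any failure.
sub fetch_by_gi {
    my ($gi) = @_;
    my $seq = eval {
        require Bio::DB::GenBank;       # loaded lazily, inside the eval
        my $db = Bio::DB::GenBank->new( genbank_options() );
        $db->get_Seq_by_gi($gi);
    };
    warn "fetch failed: $@" unless defined $seq;   # STDERR, not STDOUT
    return $seq;
}

# Run the CGI part only when executed directly as a script.
unless (caller) {
    my $seq = fetch_by_gi('163483');    # GI taken from the test script in this thread
    print "Content-type: text/plain\n\n";
    print defined $seq ? $seq->display_id . "\n" : "no sequence retrieved\n";
}
```

If stray output still shows up ahead of the header with these settings,
that would point at the redirect code itself rather than at the warning
machinery.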
On Fri, 12 Sep 2003, Mark Wilkinson wrote:
> there is actually a similar problem somewhere else in the code. Even if
> you use retrievaltype => 'io_string', there are cases where it will fail
> with the same symptoms.
>
> If you try to do a get_Seq_by_acc using a RefSeq identifier (e.g.
> NC_003992) you get the following warning in your errorlog:
>
> -------------------- WARNING ---------------------
> MSG: [gb|NC_003992] is not a normal sequence database but a RefSeq
> entry. Redirecting the request.
>
> Unfortunately, somewhere in that redirection something is printed to
> STDOUT because the next message is:
>
>
> ---------------------------------------------------
> [Fri Sep 12 11:29:50 2003] [error] [client 24.78.208.156] malformed
> header from script. Bad header=LOCUS NC_003992 :
> Services.cgi
> [Fri Sep 12 11:29:50 2003] [warn] /cgi-bin/Services.cgi did not send an
> HTTP header
>
> So, this re-direction fails in a CGI environment :-(
>
> Same problem with retrievaltype => 'tempfile'
>
> M
>
>
>
> On Tue, 2003-09-09 at 13:47, Lincoln Stein wrote:
> > Sorry about any confusion this caused. However, it is mentioned in the docs
> > for WebDBSeqI. Perhaps the default should be changed to "tempfile", which
> > should work in all cases.
> >
> > Lincoln
> >
> > On Wednesday 20 August 2003 01:05 pm, Jason Stajich wrote:
> > > > So your script is doing what it's supposed to, it's just that some other
> > > > stuff is getting out on STDOUT before your webserver is able to get in
> > > > on the act.
> > > >
> > > > Having played a bit, this proves to be interesting:
> > > >
> > > > #!/usr/bin/perl -w
> > > > use strict;
> > > > use Bio::DB::GenBank;
> > > >
> > > > close STDOUT;
> > > >
> > > > my $d = Bio::DB::GenBank->new();
> > > > my $seq = $d -> get_Seq_by_gi('163483');
> > > >
> > > >
> > > > This gives me:
> > > >
> > > > print() on closed filehandle STDOUT at
> > > > /usr/lib/perl5/site_perl/5.8.0/Bio/DB/WebDBSeqI.pm line 701
> > > >
> > > > So WebDBSeqI.pm is usurping STDOUT as part of its query. This probably
> > > > explains what you're getting. Apache will redirect STDOUT straight to
> > > > the return stream for the connection. This means it gets the output
> > > > intended for WebDBSeqI and it appears in your program's output. You then
> > > > get the output you printed.
> > >
> > > This is part of Lincoln's rechaining of the IO and using fork - looking
> > > at his comments in the code.
> > > # Try to create a stream using POSIX fork-and-pipe facility.
> > > # this is a *big* win when fetching thousands of sequences from
> > > # a web database because we can return the first entry while
> > > # transmission is still in progress.
> > > # Also, no need to keep sequence in memory or in a temporary file.
> > > # If this fails (Windows, MacOS 9), we fall back to non-pipelined
> > > # access.
> > >
> > > You can turn this off by adding to the DB::GenBank init
> > > my $db = new Bio::DB::GenBank(-retrievaltype => 'io_string');
> > >
> > > -retrievaltype => 'io_string' (for in-memory holding of the sequence
> > > before parsing)
> > > or
> > > -retrievaltype => 'temp' (for use of tempfiles, but I'm not 100%
> > > sure this code has gotten a workout; the tempfiles may
> > > not be cleaned up until the program exits, which might
> > > be a problem for scripts running under mod_perl)
> > >
> > > > If this is right, you should have some interesting error messages in
> > > > your logs if you run your script with warnings enabled.
> > > >
> > > > I can't see an immediate fix for this, short of running your fetch as a
> > > > completely detached process with a separate STDOUT, but that kind of
> > > > defeats the point of using mod_perl. The use of a pipe from STDOUT to
> > > > read the results of a web query seems pretty ingrained in WebQueryI.pm
> > > > and it may not be trivial to change it.
> > > >
> > > > Maybe others will be able to think of a simpler work-round?
> > > >
> > > >
> > > > Simon.
> > > >
> > > > _______________________________________________
> > > > Bioperl-l mailing list
> > > > Bioperl-l at portal.open-bio.org
> > > > http://portal.open-bio.org/mailman/listinfo/bioperl-l
> > >
> > > --
> > > Jason Stajich
> > > Duke University
> > > jason at cgt.mc.duke.edu
>
--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu