needle -filter

Jonathan Barber jon at compbio.dundee.ac.uk
Tue Jun 10 11:58:34 UTC 2003


On Tue, Jun 10, 2003 at 12:15:29PM +0100, simon andrews (BI) wrote:
> 
> 
> > -----Original Message-----
> > From: Jonathan Barber [mailto:jon at compbio.dundee.ac.uk]
> > Sent: 10 June 2003 10:31
> > To: 'emboss at embnet.org'
> > Subject: Re: needle -filter
> > 
> > > As an aside, the idea of the asis:: USA is a really nice way around
> > > this whole problem.  The trouble is that it's limited (I presume) by
> > > the command line length your shell allows, and by you not being able
> > > to specify a name for the sequence.  If there was some easy way
> > > around these limitations, it would solve the problem for the
> > > situations I'm likely to encounter.
> > 
> > Again, if you're using Perl, then you can use system() with a list
> > argument rather than a scalar; this bypasses the shell entirely (a
> > scalar argument only does so if it has no shell metacharacters in
> > it; see perldoc -f system).
> 
> system() doesn't help me as I can't read STDOUT from it, but looking at
> the docs for IPC::Open3 it looks like I can do the whole thing that
> way instead.  Cheers for the pointer, I wouldn't have thought about
> doing it that way!

That's the way I usually handle the EMBOSS programs as well, but I
didn't think of using the asis:: USA. Nifty.

> 
> After a bit of playing, it's not quite as easy as I'd hoped, but I got
> it to work.  The script at the bottom shows one way to get needle to 
> work without having to write anything to disk.
> 
> One quick extra feature request though.  For the asis:: USA, would it be
> possible to assign a name to the sequences passed in, even if it's
> only 1, 2, 3 or seq1, seq2, seq3, etc.?  It would make parsing of output
> files much easier.  A way to specify a name in the USA would be even
> better (e.g. asis::name>GAGAGTGTAGT or whatever).
> 
> Cheers for all the help on this, it seems to have provoked some interesting
> discussion.
> 
> 	TTFN
> 
> 	Simon.
> 
> 
> #!/usr/bin/perl -w
> use strict;
> use IPC::Open3;
> 
> my $seq_a = 'CCAGCCCATTTATCTATACCATGAGGTAACTGAAGTAAGGAGAGCAGTGA';
> 
> my $seq_b = 'CCAGCCCATTTATCTATACCATGAGGTTTCTGAAGTAAGGAGAGCAGTGA';
> 
> open (ERRORS,'>/dev/null') || die "Can't open /dev/null :$!";
> 
> open3 (\*INPUT,\*OUTPUT,\*ERRORS,'needle',"asis::$seq_a","asis::$seq_b");
> 
> print INPUT "10\n";
> print INPUT "0.5\n";
> print INPUT "stdout\n";
> print INPUT "\n";
> 
> close INPUT;
> 
> print "OUTPUT: $_" while (<OUTPUT>);

One small point on using open3() is that you should probably record the
pid and waitpid() on it to prevent zombies:
 
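my $pid = open3 (\*INPUT,\*OUTPUT,\*ERRORS,'needle',"asis::$seq_a","asis::$seq_b");

# Answer needle's prompts. The heredoc terminator must stand alone on its
# line, with no trailing semicolon.
print INPUT <<FOO;
10
0.5
stdout

FOO

close INPUT;

print "OUTPUT: $_" while <OUTPUT>;

# Reap the child so that it doesn't linger as a zombie.
waitpid $pid, 0;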
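
Pulling the pieces together, here is a minimal sketch of how the whole
thing could be wrapped up. The helper name run_with_input is made up for
illustration and is not part of EMBOSS or IPC::Open3. It assumes needle
is on the PATH and asks the same prompts as in the script above. It also
assumes, going by the IPC::Open3 documentation, that passing a plain
handle such as \*ERRORS makes open3 reopen it as a pipe from the child,
so to really send stderr to /dev/null the sketch uses the ">&" prefix
instead.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use IPC::Open3;

# Hypothetical helper, not from the thread: run @cmd without a shell, send
# $input to its stdin, and return (captured stdout, exit code).
sub run_with_input {
    my ($input, @cmd) = @_;

    # Discard the child's stderr. The ">&" prefix tells open3 to hand this
    # existing handle to the child rather than creating a new pipe.
    open(DEVNULL, '>', '/dev/null') or die "Can't open /dev/null: $!";

    # Passing the command as a list avoids the shell, as with system(LIST).
    my $pid = open3(my $in, my $out, '>&DEVNULL', @cmd);

    # The input here is only a few bytes, so it is written in one go
    # before any output is read.
    print {$in} $input;
    close $in;

    # Slurp everything the child writes to stdout.
    my $output = do { local $/; <$out> };
    $output = '' unless defined $output;
    close $out;

    # Reap the child so that it does not linger as a zombie, and keep
    # its exit code.
    waitpid $pid, 0;
    my $status = $? >> 8;

    close DEVNULL;
    return ($output, $status);
}

my $seq_a = 'CCAGCCCATTTATCTATACCATGAGGTAACTGAAGTAAGGAGAGCAGTGA';
my $seq_b = 'CCAGCCCATTTATCTATACCATGAGGTTTCTGAAGTAAGGAGAGCAGTGA';

# Same four answers to needle's prompts as in the script above.
my ($alignment, $status) = eval {
    run_with_input("10\n0.5\nstdout\n\n",
                   'needle', "asis::$seq_a", "asis::$seq_b");
};

if (!defined $status) {
    warn "Could not run needle: $@\n";
}
elsif ($status != 0) {
    warn "needle exited with status $status\n";
}
else {
    print "OUTPUT: $_\n" for split /\n/, $alignment;
}
```

Checking the exit code after waitpid() also gives you a cheap way to
notice when needle fails, which is otherwise easy to miss once stderr is
being thrown away.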

-- 
Jon


