[EMBOSS] Reading Two Sequences from stdin with water

pmr at ebi.ac.uk
Fri Jun 3 14:09:03 UTC 2005


Jan T. Kim writes:
> is it possible to read both input sequences to a pairwise alignment
> from one input stream?
>
>     cat x.fasta | water -asequence fasta::stdin:seq1 -bsequence fasta::stdin:seq2 -outfile stdout -auto
>
> gives
>
>    EMBOSS An error in ajfile.c at line 1926:
>    Error reading from file 'stdin'
>
> It may well be that water consumes the entire input stream on getting the
> first sequence, thus rendering itself unable to acquire the second one.
>
> Is there a solution to this? I would really like to avoid the mess of
> temporary files and run water in a clean pipe (pun intended  ;-)  )

EMBOSS can only read stdin cleanly as a single input. We should probably trap
that internally and report an error if stdin is opened a second time. I wonder
whether there is any useful way to share the stdin file buffer. Hmmmm... in
the early days of EMBOSS we decided not to allow it, but it could be worth
a try. You would still be in trouble if you tried to read the second
sequence first, though.

Assuming your x.fasta file has only seq1 and seq2 in that order, reading
seq1 will continue until the header line of seq2 has been read. By then
that line has already been consumed from the stream, so it is too late for
seq2 to be read cleanly.
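
To make that concrete, here is a minimal Python sketch of a forward-only
FASTA reader. It is only an illustration of the general problem, not the
EMBOSS code; the function name read_first_record and the sample headers are
made up for the example. The reader cannot know seq1 has ended until it has
already taken the seq2 header off the stream.

```python
import io


def read_first_record(stream):
    """Read one FASTA record from a forward-only stream.

    Returns (header, sequence, lookahead). The lookahead is the line that
    showed the record had ended (the next record's header), or None at
    end of input.
    """
    header = None
    parts = []
    for line in stream:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                # The end of the record is only visible after this line
                # has been consumed; a pipe cannot be rewound.
                return header, "".join(parts), line
            header = line[1:]
        elif header is not None:
            parts.append(line.strip())
    return header, "".join(parts), None


# Hypothetical stand-in for the contents of x.fasta arriving on stdin.
stream = io.StringIO(
    ">seq1 first\nctagtacg\natgcgatcg\n"
    ">seq2 second\ntgatcgatgg\nctacgtagc\n"
)
header, seq, lookahead = read_first_record(stream)
remaining = stream.read()  # what a second, independent reader would see
```

After the first read, remaining holds only the residues of seq2 with no
header line in front of them, so a second reader opening the same stream has
nothing to identify seq2 by.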

At least you have fasta:: specified - with no specified format, EMBOSS has
to read a long way into the input just to check whether it is really GCG
format.

As for the asis format, I suppose an EMBOSS utility that reads x.fasta and
outputs asis::ctagtacgatgcgatcg asis::tgatcgatggctacgtagc would be useful
to you - then you could put `sillyname x.fasta` in your command line... at
least until the command line gets too long. It would be hard to preserve
the ID and description of the sequences that way, though.
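
A rough sketch of such a helper, in Python rather than as a real EMBOSS
application. No such utility ships with EMBOSS as far as this message says;
the script name fasta2asis.py and the function fasta_to_asis are
hypothetical. It simply joins the residue lines of each FASTA record and
prefixes them with asis::, dropping the ID and description.

```python
import sys


def fasta_to_asis(lines):
    """Return one 'asis::SEQUENCE' string per FASTA record.

    `lines` is any iterable of text lines, such as an open file.
    IDs and descriptions are discarded, as asis carries only residues.
    """
    usas = []
    parts = None  # None until the first '>' header has been seen
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if parts is not None:
                usas.append("asis::" + "".join(parts))
            parts = []
        elif line and parts is not None:
            parts.append(line)
    if parts is not None:
        usas.append("asis::" + "".join(parts))
    return usas


# Command-line use (hypothetical): python fasta2asis.py x.fasta
if __name__ == "__main__" and len(sys.argv) == 2:
    with open(sys.argv[1]) as handle:
        print(" ".join(fasta_to_asis(handle)))
```

Assuming water takes the two sequences as its first two positional
parameters, the output could be dropped into the command line with
backticks: water `python fasta2asis.py x.fasta` -outfile stdout -auto.
Otherwise the two values would need to be passed to -asequence and
-bsequence separately.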

"If you think water is pure, just remember what fish do in it."

Hope that helps,

Peter



