[EMBOSS] EMBOSS seqret : IntelliGenetics and new DOS lines

Daniel Barker db60 at st-andrews.ac.uk
Tue Jul 21 11:24:28 UTC 2009


Dear Peters et al.,

EMBOSS claims not to care about whether newlines are DOS or UNIX:

'EMBOSS programs can read in both PC and Unix text file formats, so it 
is not necessary for you to use this utility all of the time' - noreturn 
documentation.

This would certainly be good. 'The newline problem' must be the single 
biggest computational waste of time I've experienced over the years!

It's easy to avoid with tr, u2d, d2u, noreturn, etc., but it's one more 
thing that can go wrong, especially when data is shared between 
different sites.
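
For anyone who would rather do the conversion inside a script, here is a 
minimal sketch in Python. It is not how noreturn or any EMBOSS program 
works internally; the function name to_unix_newlines is made up for this 
example, and it assumes the whole file fits in memory as bytes.

```python
def to_unix_newlines(data: bytes) -> bytes:
    """Convert DOS (CR LF) and old Mac (CR only) newlines to UNIX (LF)."""
    # Replace CR LF pairs first, so a DOS newline does not become two LFs
    # when the lone CRs are converted in the second step.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


if __name__ == "__main__":
    mixed = b"line1\r\nline2\rline3\n"
    print(to_unix_newlines(mixed))
```

The order of the two replacements matters: converting lone \015 first 
would turn each DOS newline into a blank line.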

I've noticed that a small amount of software, in the world in general, 
still uses the Mac OS 9 (and earlier) convention, where the newline is 
\015 (CR) only. For example, this tab-delimited text was saved from 
Excel 2004 for Mac:

$ od -bc Workbook1.txt
0000000   061 011 062 011 063 015 064 011 065 011 066
            1  \t   2  \t   3  \r   4  \t   5  \t   6
0000013
$
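
The same check can be sketched in Python, as a rough way to see which 
convention a file uses before handing it to other software. The function 
name count_newline_styles is hypothetical, and the sample bytes are 
simply the eleven bytes from the od dump above.

```python
def count_newline_styles(data: bytes) -> dict:
    """Count DOS (CR LF), old Mac (CR only) and UNIX (LF only) newlines."""
    crlf = data.count(b"\r\n")
    # Every CR LF pair contains one CR and one LF, so subtract the pairs
    # to get the bare CRs and bare LFs.
    cr_only = data.count(b"\r") - crlf
    lf_only = data.count(b"\n") - crlf
    return {"CRLF": crlf, "CR": cr_only, "LF": lf_only}


if __name__ == "__main__":
    # The bytes shown in the od output: 1 \t 2 \t 3 \r 4 \t 5 \t 6
    sample = b"1\t2\t3\r4\t5\t6"
    print(count_newline_styles(sample))
```

For the Excel file above this reports one CR-only newline and no others.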

I expect this usage will decline, since it's in conflict with the 
convention of Mac OS X's own command-line tools (\012 only, like Linux). 
Probably the '\015 only' convention hasn't had much impact on 
bioinformatics anyway?

Best wishes,

Daniel

-- 
Daniel Barker
http://bio.st-andrews.ac.uk/staff/db60.htm
The University of St Andrews is a charity registered in Scotland :
No SC013532


