[Bioperl-l] Creating FASTA library.
Robert Citek
rwcitek@alum.calberkeley.org
Fri, 30 Aug 2002 10:15:44 -0500
Hello Jean-Jack,
At 10:10 AM 8/30/2002 +0100, Jean-Jack Riethoven wrote:
>If you have thousands of files and if the MacOS X shells have the same
>limitations of the 'normal' unix shells (you get a "Arg list too long" or
>similar message), you can use this workaround:
>
>find . -name "*.pep" -exec cat \{\} >> file_for_clustalw \;
This approach forks a separate 'cat' for every single .pep file, which will
probably be very slow. (The ">>" redirection is handled by the shell rather
than by find, so file_for_clustalw itself is opened only once.) If you do
have thousands of files, a better method is to pipe the names to xargs:
find . -type f -name "*.pep" | xargs cat > file_for_clustalw
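As a rough illustration of the difference in process count, here is a small
sketch. The scratch directory and the five seqN.pep files are made up for the
demonstration, and 'sh -c' stands in for 'cat' so that each invocation prints
one line that can be counted:

```shell
# Hypothetical scratch directory with five small .pep files (made-up names).
demo=$(mktemp -d)
for i in 1 2 3 4 5; do
    printf '>seq%s\nMKV\n' "$i" > "$demo/seq$i.pep"
done

# -exec ... \; starts one process per matching file.
per_file=$(find "$demo" -type f -name "*.pep" \
    -exec sh -c 'echo run' sh {} \; | wc -l | tr -d ' ')

# xargs packs as many file names as fit onto each command line.
batched=$(find "$demo" -type f -name "*.pep" |
    xargs sh -c 'echo run' sh | wc -l | tr -d ' ')

echo "per-file invocations: $per_file"
echo "xargs invocations:    $batched"
```

With only five short names, everything fits on one command line, so this
should report 5 invocations for -exec and 1 for xargs.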
I've included the "-type f" switch to ensure that you only get files and
not directories. Also, I changed the ">>" to ">", so the output file is
overwritten rather than appended to. xargs splits the list of names into as
many 'cat' invocations as needed to stay under the system's command-line
length limit. You can also force xargs to pass only a certain number of
arguments per invocation with the -n option (--max-args in GNU xargs), and
have it behave nicely with names containing spaces by pairing find's -print0
with the xargs -0 option:
find . -type f -name "*.pep" -print0 |
xargs -0 -n 100 cat > file_for_clustalw
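For anyone who wants to try the NUL-delimited variant on a small scale first,
here is a minimal sketch. The scratch directory, the file names, and the
library.fasta output name are all made up for the example, and -n 2 (the
short option for capping the argument count) is chosen only so that the
three files are split across two 'cat' runs:

```shell
# Hypothetical scratch directory; one file name contains a space.
demo=$(mktemp -d)
printf '>a\nMKV\n' > "$demo/first seq.pep"
printf '>b\nMLT\n' > "$demo/second.pep"
printf '>c\nMAA\n' > "$demo/third.pep"

# NUL-delimited names keep "first seq.pep" intact as one argument;
# -n 2 caps each cat invocation at two file names.
find "$demo" -type f -name "*.pep" -print0 |
    xargs -0 -n 2 cat > "$demo/library.fasta"

# Each input file has two lines, so the library should have six.
lines=$(wc -l < "$demo/library.fasta" | tr -d ' ')
echo "lines in library: $lines"
```

The order of the records in the output follows whatever order find visits
the files, which is not guaranteed to be alphabetical.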
Regards,
- Robert