fasta splitter
Peter Rice
peter.rice at uk.lionbioscience.com
Tue Oct 8 16:37:36 UTC 2002
Tony Cox wrote:
> On Tue, 8 Oct 2002, January Weiner 3 wrote:
>
> Thanks to all that responded. I did, in the end write a 12 line bioperl script
> to split my fasta file. My request seems, however, to highlight a small blind
> spot on the EMBOSS radar. It appears that there are a number of implementations
> out there - perhaps one of them can be donated to the emboss project as the
> basis of a new software tool?
Nobody suggested hacking "seqret" to do what you want...
One problem with doing this in EMBOSS is the need to generate filenames for
your split files - but maybe a base filename would be enough to generate the
names. Then all you need to do is count sequences in a modified seqret.c
and switch to a new output file each time the limit is reached. You can add
a command line option for the number of sequences per output file. Cleaning
up output files for a rerun is an exercise for the user (unless you want to
invent a new ACD type that does it :-)
This needs a modified version of the seqFileReopen function to handle the
file naming, but nothing complicated is involved.
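
A minimal sketch of the counting and file-naming idea, written in plain
Python rather than against the EMBOSS libraries. It is not the seqret.c
change itself: the function names read_fasta and split_fasta, the
"<base>_<n>.fasta" naming scheme and the seqs_per_file parameter are all
assumptions made for illustration.

```python
import itertools


def read_fasta(lines):
    """Yield each FASTA record as a list of lines, header line first."""
    record = []
    for line in lines:
        # A new ">" header closes the record collected so far.
        if line.startswith(">") and record:
            yield record
            record = []
        if line.strip():  # skip blank lines
            record.append(line if line.endswith("\n") else line + "\n")
    if record:
        yield record


def split_fasta(in_path, base_name, seqs_per_file):
    """Write records to base_name_1.fasta, base_name_2.fasta, ...

    Hypothetical naming scheme: a base filename plus a running counter.
    Returns the output file names in the order they were written.
    """
    if seqs_per_file < 1:
        raise ValueError("seqs_per_file must be at least 1")
    written = []
    with open(in_path) as handle:
        records = read_fasta(handle)
        for index in itertools.count(1):
            # Take the next batch; an empty batch means the input is used up.
            chunk = list(itertools.islice(records, seqs_per_file))
            if not chunk:
                break
            out_path = f"{base_name}_{index}.fasta"
            with open(out_path, "w") as out:
                for record in chunk:
                    out.writelines(record)
            written.append(out_path)
    return written
```

Calling split_fasta("all.fasta", "part", 100) would write part_1.fasta,
part_2.fasta and so on, with the last file holding whatever is left over.
As noted above, files from an earlier run are not cleaned up; a rerun with a
larger seqs_per_file leaves stale higher-numbered files behind.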
regards.
Peter
--
------------------------------------------------
Peter Rice, LION Bioscience Ltd, Cambridge, UK
peter.rice at uk.lionbioscience.com +44 1223 224723