fasta splitter

Tony Cox avc at sanger.ac.uk
Tue Oct 8 16:39:31 UTC 2002


On Tue, 8 Oct 2002, Peter Rice wrote:

That sounds excellent - does this mean it really will make it into the EMBOSS
release? (Any idea when? ;)

Tony


+>Tony Cox wrote:
+>> On Tue, 8 Oct 2002, January Weiner 3 wrote:
+>>
+>> Thanks to all who responded. I did, in the end, write a 12-line bioperl script
+>> to split my fasta file. My request seems, however, to highlight a small blind
+>> spot on the EMBOSS radar. It appears that there are a number of implementations
+>> out there - perhaps one of them can be donated to the emboss project as the
+>> basis of a new software tool?
+>
+>Nobody suggested hacking "seqret" to do what you want...
+>
+>One problem doing this in EMBOSS is the need to generate filenames for your
+>split files - but maybe a base filename would be enough to generate
+>names. Then all you need to do is count sequences in a modified seqret.c
+>and change the output file. You can add a command line option for the
+>number of sequences in an output file. Cleaning up output files for a rerun
+>is an exercise for the user (unless you want to invent a new ACD type that
+>does it :-)
+>
+>Needs a modified version of the seqFileReopen function to handle the file
+>naming, but nothing complicated is involved.
+>
+>regards.
+>
+>Peter
+>
+>--
+>------------------------------------------------
+>Peter Rice, LION Bioscience Ltd, Cambridge, UK
+>peter.rice at uk.lionbioscience.com +44 1223 224723
+>
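
For readers following the thread, the approach Peter outlines (a base filename
for generating output names, a count of sequences, and an option for the number
of sequences per output file) can be sketched outside EMBOSS roughly as below.
This is only an illustration in plain Python, not the seqret.c change itself
and not the bioperl script mentioned above. The function names, the
"_0001.fasta" naming pattern, and the seqs_per_file parameter are all
assumptions made for the example.

```python
from pathlib import Path
from typing import Iterable, Iterator, List


def read_fasta_records(lines: Iterable[str]) -> Iterator[List[str]]:
    """Yield each FASTA record as a list of lines, header line first."""
    record: List[str] = []
    for line in lines:
        # A new header closes the record collected so far.
        if line.startswith(">") and record:
            yield record
            record = []
        if line.strip():  # skip blank lines
            record.append(line if line.endswith("\n") else line + "\n")
    if record:
        yield record


def split_fasta(in_path, base_name, seqs_per_file=1, out_dir="."):
    """Split in_path into files named <base_name>_0001.fasta, _0002.fasta, ...

    The naming pattern is an assumption for this sketch. Existing files with
    the same names are overwritten; stale files from an earlier run with more
    chunks are NOT removed (cleanup is left to the caller).
    Returns the list of paths written, in order.
    """
    if seqs_per_file < 1:
        raise ValueError("seqs_per_file must be at least 1")
    out_dir = Path(out_dir)
    written = []
    out = None
    try:
        with open(in_path) as handle:
            for count, record in enumerate(read_fasta_records(handle)):
                # Count sequences and switch output file every seqs_per_file.
                if count % seqs_per_file == 0:
                    if out is not None:
                        out.close()
                    path = out_dir / f"{base_name}_{len(written) + 1:04d}.fasta"
                    out = open(path, "w")
                    written.append(path)
                out.writelines(record)
    finally:
        if out is not None:
            out.close()
    return written
```

With a hypothetical input file, split_fasta("all.fasta", "part", seqs_per_file=100)
would write part_0001.fasta, part_0002.fasta and so on into the current
directory. As noted in the thread, cleaning up old output files before a rerun
is still left to the user.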

******************************************************
Tony Cox			Email:avc at sanger.ac.uk
Sanger Institute		WWW:www.sanger.ac.uk
Wellcome Trust Genome Campus	Webmaster
Hinxton				Tel: +44 1223 834244
Cambs. CB10 1SA			Fax: +44 1223 494919
******************************************************



