[EMBOSS] how string is shuffleseq

David Mathog mathog at caltech.edu
Mon Aug 4 17:39:36 UTC 2008


> Not a direct answer to your question - but there was an interesting 
> paper on shuffling recently, did you see it:
> 
> Jiang et al. (2008) uShuffle: A useful tool for shuffling biological 
> sequences while preserving the k-let counts. BMC Bioinformatics 9: 192.

An even less direct answer - I finally got around to rewriting and
generalizing my make_random_dna program from GCG/Fortran to
make_random_seq in standard C, under a GPL 2 license.  So now I can
distribute it, instead of having to tell people who ask for it that
they can't have it because of licensing issues.

This program does not maintain the tuple ratios exactly, as (some)
shuffle programs do, but it comes arbitrarily close to doing so when
the generated sequence is long enough.  Like the various shuffle
programs, it is normally used to generate control sequences for other
programs.  The original handled only DNA; the new one handles DNA,
protein, ASCII, or any specified subset of ASCII.
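
To illustrate the "arbitrarily close on a long enough sequence" point,
here is a minimal sketch of one common way to get that behaviour:
sample each symbol from pair (2-tuple) frequencies counted in a model
sequence.  This is not the make_random_seq source, and nothing here is
taken from it; the function names, the 2-tuple limit, and MAX_SYMS are
assumptions made for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_SYMS 32   /* assumed cap on alphabet size for this sketch */

/* Hypothetical helper: return index i with probability w[i] / sum(w). */
static int pick_weighted(const double *w, int n)
{
    double total = 0.0;
    int last = 0;
    for (int i = 0; i < n; i++) total += w[i];
    double r = total * (rand() / (RAND_MAX + 1.0));
    for (int i = 0; i < n; i++) {
        if (w[i] <= 0.0) continue;   /* never pick a zero-weight symbol */
        last = i;
        if (r < w[i]) return i;
        r -= w[i];
    }
    return last;                     /* guard against rounding error */
}

/* Hypothetical sketch: write len symbols plus '\0' into out, which must
 * hold len + 1 bytes.  Pair frequencies are counted from model. */
void sketch_random_seq(const char *alphabet, const char *model,
                       char *out, size_t len)
{
    int n = (int)strlen(alphabet);
    assert(n > 0 && n <= MAX_SYMS);

    double single[MAX_SYMS] = {0};
    double pair[MAX_SYMS][MAX_SYMS] = {{0}};
    double seen = 0.0;
    int prev = -1;

    /* Count symbols and adjacent pairs; unknown characters break a pair. */
    for (const char *p = model; *p; p++) {
        const char *hit = strchr(alphabet, *p);
        int cur = hit ? (int)(hit - alphabet) : -1;
        if (cur >= 0) {
            single[cur] += 1.0;
            seen += 1.0;
            if (prev >= 0) pair[prev][cur] += 1.0;
        }
        prev = cur;
    }
    if (seen == 0.0)                 /* empty model: fall back to uniform */
        for (int i = 0; i < n; i++) single[i] = 1.0;

    /* Draw each symbol conditioned on the previous one when possible. */
    prev = -1;
    for (size_t i = 0; i < len; i++) {
        const double *w = single;
        if (prev >= 0) {
            double row = 0.0;
            for (int j = 0; j < n; j++) row += pair[prev][j];
            if (row > 0.0) w = pair[prev];
        }
        prev = pick_weighted(w, n);
        out[i] = alphabet[prev];
    }
    out[len] = '\0';
}
```

Because each symbol is drawn at random, the pair counts in a short
output will wander from those of the model; only as len grows do the
observed frequencies settle toward them.  A shuffle program, by
contrast, preserves the counts exactly.  Call srand() once beforehand
if a different sequence is wanted on each run.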

It is available here:

http://saf.bio.caltech.edu/pub/software/molbio/make_random_seq.c

Regards,

David Mathog
mathog at caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech


