<div dir="ltr">Hi Alan, thanks for your feedback.<br>You've thrown out some ideas that hadn't crossed my mind. I was just wondering: why did you find it necessary to cache distributions in the first place? What was the scale of your work? In my experience, amino acid distributions of six complete eukaryotic proteomes for k up to 6 or 7 could fit into about seven gigs (with no optimisation, not even numpy), so I thought nucleotide distributions would become prohibitive in terms of RAM only when there are tens of them.<br></div>