[Biopython-dev] 'testseq' function update #3

Fri Jun 2 09:06:09 UTC 2017

Hi Adil,

Are you still thinking of putting this under the Scripts/ folder,
or Doc/examples maybe?

Either way, since your script has doctests we ought to be able
to hook that into the main test suite via our run_tests.py - I
would guess something like a little file Tests/test_testseq.py
which runs your doctests would work for this?
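
As a rough sketch of what such a file might do (gc_count below is a
made-up stand-in with a doctest, not code from your script, and
run_doctests is just an illustrative helper name):

```python
import doctest


def gc_count(seq):
    """Count G and C letters; a stand-in for a function from testseq.py.

    >>> gc_count("ACGT")
    2
    >>> gc_count("")
    0
    """
    return seq.count("G") + seq.count("C")


def run_doctests(func):
    """Run the doctests in func's docstring and return the results."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    # Pass globs explicitly so the examples can see the function by name.
    for test in finder.find(func, func.__name__, globs={func.__name__: func}):
        runner.run(test)
    # TestResults is a named tuple with 'failed' and 'attempted' fields.
    return runner.summarize(verbose=False)


results = run_doctests(gc_count)
```

In a real Tests/test_testseq.py it would probably be simpler to import
the script as a module and wrap it with doctest.DocTestSuite so that
unittest can run it, assuming the script is importable from Tests/.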

One thing to watch out for with doctests is that they often expose
subtle cross-platform differences (e.g. Windows vs Linux,
Python 2.7 vs 3.3 vs 3.6, or CPython vs PyPy vs Jython).

A pull request sounds good :)

Peter

On Thu, Jun 1, 2017 at 10:18 PM, Adil Iqbal <aiqbal85 at gmail.com> wrote:

> Hello again!
>
> I've detailed the changes below. You can view the updated version of my
> code here:
> https://github.com/Adil-Iqbal/Personal-Projects/blob/master/Test%20Sequence/testseq.py
>
> I believe I have found a satisfactory solution based on Andrew's
> suggestion.
>
> Before I talk about the solution, I'd like to quickly recap what I
> discussed in an earlier email; once the global instance of the
> random.Random class is seeded, it cannot be reverted to its original
> behavior (which is to re-seed itself with every function call). Instead
> what happens is that the Random class seeds based on the system date/time
> -- which means it can only generate one new sequence every few milliseconds.
> That is undesirable since I would like the function to be able to produce a
> unique sequence with every function call if desired by the user. In cases
> involving for-loops, the function would fail if seeded using the system
> date/time.
>
> I tried using a global variable named "shuffle_seed" to seed the RNG,
> which I would increment with every call, but that would require biopython
> to use memory to track how many times the user ran the function. That was
> not ideal since the user's code should be allowed to proceed as
> independently as possible.
>
> I then tried to implement Andrew's most recent suggestion verbatim, which
> was to instantiate the random.Random class outside of the function
> definition and seed it within the function only if the seed was declared as
> an argument. The benefit of this method was that I was only creating one
> instance of the Random class and not having to track the user's function
> calls. Unfortunately, once the global instance was seeded, it began to
> generate sequences based on date/time again.
>
> The solution that works well is to create two global instances of the
> Random class called "seeded_instance" and "anchor_instance". The seeded
> instance will be seeded with the "rand_seed" argument every function call.
> If a seed is not declared, the anchor_instance will be invoked to assign a
> random value to "rand_seed." Since anchor_instance is never seeded, it
> retains the original desired behavior that I was after.
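
For reference, a minimal sketch of the two-instance arrangement described
above. Only seeded_instance, anchor_instance and rand_seed come from your
description; the function name make_testseq, its other parameters and the
seed range are assumptions for illustration, and the real testseq.py may
well differ:

```python
import random

# Two module-level generators, created once when the module is loaded.
# Random() with no argument seeds itself from an OS entropy source when
# one is available, otherwise from the current time.
seeded_instance = random.Random()  # re-seeded on every call
anchor_instance = random.Random()  # never explicitly seeded


def make_testseq(size=30, alphabet="ACGT", rand_seed=None):
    """Return a random sequence string (hypothetical stand-in)."""
    if rand_seed is None:
        # No seed supplied: draw one from the never-seeded generator,
        # so repeated calls keep producing different sequences.
        rand_seed = anchor_instance.randint(0, 2**32 - 1)
    seeded_instance.seed(rand_seed)
    return "".join(seeded_instance.choice(alphabet) for _ in range(size))


same_a = make_testseq(rand_seed=42)
same_b = make_testseq(rand_seed=42)
# Unseeded calls made after a seeded call should still vary.
varied = {make_testseq() for _ in range(20)}
```

With this layout an explicit seed gives a repeatable sequence, while the
unseeded calls are unaffected by any earlier seeding of seeded_instance.
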
>
> This method is efficient, in that only two instances of the Random class
> are required and generated upon loading the module. It's also entirely
> independent of the user's code AND the rest of biopython, since everything
> is instantiated. Most importantly, it works perfectly. You may not be able
> to see it on my github code, but I have been running doctests on my local
> copy of the code and everything checks out.
>
> I'm open to any other suggestions. Are there any other things that I
> should do?  Would this code be ready for a pull request?
>
> Best,
> Adil Iqbal