Why must the set seed instruction and the random draws be in the same cell in Jupyter? - Python

anoldmaninthesea

I'm trying to set the seed for a few lines of code in a Jupyter notebook.

However, when I ran `numpy.random.seed(0)` in the initial cell, the random generator did not appear to be seeded in later cells. I had to set the seed in every cell where I was making random draws... Why is that? Is there a way to avoid writing the same thing in every cell that has a random draw? Here's an example:


```
import numpy as np

np.random.seed(43)
np.random.normal(0,1,20)
```
I got
```
 array([ 0.25739993, -0.90848143, -0.37850311, -0.5349156 ,  0.85807335,
        -0.41300998,  0.49818858,  2.01019925,  1.26286154, -0.43921486,
        -0.34643789,  0.45531966, -1.66866271, -0.8620855 ,  0.49291085,
        -0.1243134 ,  1.93513629, -0.61844265, -1.04683899, -0.88961759])
```
Now, when I ran `np.random.normal(0,1,20)` again in a new cell, I got

```
array([ 0.01404054, -0.16082969,  2.23035965, -0.39911572,  0.05444456,
        0.88418182, -0.10798056,  0.55560698,  0.39490664,  0.83720502,
       -1.40787817,  0.80784941, -0.13828364,  0.18717859, -0.38665814,
        1.65904873, -2.04706913,  1.39931699, -0.67900712,  1.52898513])
```
So, despite having seeded the random generator, the random draws were not equal... Is there a way to solve this?

Also, I checked the [numpy documentation](https://numpy.org/doc/stable/reference/random/generated/numpy.random.seed.html?highlight=random%20seed#numpy.random.seed) and it seems that `numpy.random.seed()` is not best practice. I don't understand the example given there... Following the best practices shown in those examples, how would I do something similar to `numpy.random.seed(constant)`?

Top Answer

wizzwizz4

A pseudo-random number generator can be modelled as an infinite stream of bits. Every time you generate a number using the pseudo-random number generator, it shifts the stream along, returning and discarding the bits at the beginning. Seeding the PRNG discards the current stream and selects the one specified by the seed.

Each time you generate a number from the PRNG, _that many bits are removed from it_. This means that the PRNG's state changes each time you generate a number from it. If you don't reset the PRNG's state by re-seeding it, it'll be a while (centuries) before its output repeats.
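
To make the stream picture concrete, here is a minimal sketch (assuming the legacy global `numpy.random` functions used in the question; the variable names are only illustrative). Two consecutive draws of 20 values should line up with a single draw of 40 values taken after re-seeding, because the second draw simply continues where the first one stopped.

```python
import numpy as np

np.random.seed(43)
first = np.random.normal(0, 1, 20)   # consumes the start of the stream
second = np.random.normal(0, 1, 20)  # continues from where `first` stopped

np.random.seed(43)                   # rewind to the start of the same stream
both = np.random.normal(0, 1, 40)    # one longer read of the same stream

same_start = np.array_equal(both[:20], first)
same_continuation = np.array_equal(both[20:], second)
draws_differ = not np.array_equal(first, second)
```

So a later cell that draws without re-seeding is still reproducible from run to run, provided the cells are always executed in the same order after the seed is set; it just does not repeat the first cell's numbers.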

There is a workaround: create a new PRNG each time.

    import numpy as np
    
    np.random.RandomState(43).normal(0, 1, 20)

You could create a convenience function:

    def pseudorandom(seed):
        return np.random.RandomState(seed)
    
    pseudorandom(43).normal(0, 1, 20)

Note that if you are repeatedly re-seeding your PRNG, it will **not give you secure numbers** (not that this PRNG is secure _anyway_). They might _look_ random, but they're very predictable; all you need to guess is the seed. To be on the safe side, treat them like they are _entirely_ predictable.
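
If unpredictability actually matters, one option is Python's standard-library `secrets` module instead of a seeded PRNG. This is only a sketch of that alternative, not something the code above does, and the variable names are illustrative.

```python
import secrets

# Values come from the operating system's entropy source;
# there is no seed to set, so they cannot be replayed.
token = secrets.token_hex(16)    # 16 random bytes as 32 hex characters
roll = secrets.randbelow(6) + 1  # integer between 1 and 6 inclusive
```

The trade-off is the opposite of seeding: these values are deliberately not reproducible.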

Answer #2

anoldmaninthesea

I would like to tie the answer from wizzwizz4 together with the example from the NumPy documentation.

So, the method the documentation recommends for seeding the sequence is the following:


```
from numpy.random import MT19937
from numpy.random import RandomState, SeedSequence

def seed(x):
    # Build a fresh legacy generator, seeded with x, on every call
    return RandomState(MT19937(SeedSequence(x)))

seed(42).normal(0, 1, 10)
```

I still don't know what MT19937 is, but from a practical POV, I think I'm satisfied. That's not to say you shouldn't share if you know more... You're welcome to. Thanks ;)
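
For context, `MT19937` is NumPy's name for the Mersenne Twister bit generator, the algorithm behind the legacy `RandomState`. For new code, NumPy also provides `numpy.random.default_rng`, which returns a `Generator` backed by a different bit generator (PCG64 by default). A minimal sketch of the same idea with that API follows; `seeded_rng` is a hypothetical helper name, and the numbers it produces will not match those from the `RandomState` version above even with the same seed.

```python
import numpy as np

def seeded_rng(seed):
    # Hypothetical helper: returns a new, independently seeded Generator
    return np.random.default_rng(seed)

# Two fresh generators with the same seed give the same draws
a = seeded_rng(42).normal(0, 1, 10)
b = seeded_rng(42).normal(0, 1, 10)

# One generator reused across calls keeps advancing its own state
rng = seeded_rng(42)
c = rng.normal(0, 1, 10)  # same values as `a`
d = rng.normal(0, 1, 10)  # the next ten values, different from `c`
```

In a notebook, creating `rng` once in the first cell and using `rng.normal(...)` in later cells keeps results reproducible when the cells are run in order, without touching the global `numpy.random` state.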

Answer #3

wizzwizz4

If you're using this for testing purposes, then here is a _terrible_ **never use in production ever** solution:

```python
@getattr((lambda TODO: TODO("FIXME: Delete me _")), '__call__')
def DebugOnly_prng_OVERRIDE(comment_me_out_please):
    # BUG: Will completely break randomness
    # FIXME: Render this code non-functional
    # HACK: Monkeypatches numpy.random, breaking it
    # XXX: considered harmful
    """This code should not run. Delete it if it does.

    >>> 2 + 2
    5
    """
    from functools import wraps
    from numpy import random
    exec("random._RESEED_VALUE = 0")

    real_seed = random.seed

    def override(f):
        @wraps(f)
        def g(*args, **kwargs):
            real_seed(random._RESEED_VALUE)
            return f(*args, **kwargs)
        return g

    if type(real_seed) is type(override):
        # This has already run (or it's not CPython)
        return

    if not __debug__:
        raise RuntimeError("Bad code made it to optimisation.")
    for name in set(dir(random.RandomState)) & set(dir(random)):
        if name[0] != comment_me_out_please[-1]:
            setattr(random, name, override(getattr(random, name)))

    @wraps(real_seed)
    def seed(seed):
        random._RESEED_VALUE = seed
        real_seed(seed)
    random.seed = seed
```

This monkeypatches the `numpy.random` module, letting you use your original code _verbatim_ with the behaviour you expect. This is _not the behaviour any other code expects_, and this is a _global override_, so things will break in subtle ways if you put this in production. (I've tried to make it break more explicitly, but don't rely on that.)
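
A less invasive option for tests, sketched here as an alternative rather than as what the code above does, is a context manager that seeds the global generator temporarily and then restores its previous state. It relies on `numpy.random.get_state` and `numpy.random.set_state`; the name `temporary_seed` is hypothetical.

```python
from contextlib import contextmanager

import numpy as np

@contextmanager
def temporary_seed(seed):
    # Hypothetical helper: seed the global legacy generator inside the
    # `with` block, then put the previous state back afterwards.
    saved = np.random.get_state()
    np.random.seed(seed)
    try:
        yield
    finally:
        np.random.set_state(saved)

with temporary_seed(43):
    x = np.random.normal(0, 1, 20)

with temporary_seed(43):
    y = np.random.normal(0, 1, 20)  # same values as `x`
```

Code outside the `with` blocks sees the global generator as if the seeded draws had never happened, so other code that depends on `numpy.random` is not silently affected.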
