Traversing an array in random order / Non repetitive Random Number Generator

The problem is simple. You have an array and you want to traverse it in random order.

There is a trivial solution, to sort the array:

for(int i=N; i>1; --i)
      swap( array[ rand()%i ], array[ i-1 ] );

… but what if you can’t modify the array and having an array of indexes is too expensive, let’s say for example that you want to traverse all the pixels of the screen. What you actually need is a non repetitive random number generator which works in linear time.

There is a handy one around, it is called Linear Congruential Generator (LCG). This is the one most programming languages implement as their default random number generator.

The LCG equation is simple:
X’ = ( a * X + c) % m
Where X is the last generated number, X’ is the new one, a is an arbitrary constant multiplier, c an arbitrary constant increment and m is the modulus.
So this function jumps from number to number in a non sequential way, but to have a full period of m (to always fall in a different number without repeating it until m jumps) some conditions should be met:

0
a and c can be any finite number, positive or negative but it will be the same if they are 0 <= c < m, 0 <= a < m because of the modulus operation. For c it is easy to see it, but let see it for a:
Suppose we take a constant a’ which is any integer number, and represent it as a’ = a+m*i being a’ = a%m and i = a/m (all in integer arithmetic)
Then ( (a’+m*i)*X + c ) % m = ( a’*X + m*i*X + c ) % m
since (m*i*X) % m = 0 we end up with ( a’ * X + c ) % m
The next conditions are described thinking on a and c in ranges [0,m).
1
c and m should be relatively prime (coprime). This means that they don’t have any common divisor. Example: 22 = 11*2 and 25 = 5*5.
2
a-1 should be divisible by all prime factors of m. Suppose m is 1176 = 2*2*2*3*7*7, a-1 must be divisible by 2*3*7=42.
3
a-1 should be multiple of 4 if m is multiple of 4 (According to Donald E. Knuth).

The first X value can be any number, it will be the seed, after m iterations you will end in the same number(mod m).

When the LCG reach the initial value it will cycle in the same order it did the first time for ever (from a number X and the same constants you will always jump to the same number).

The quality of the randomness of the LCG is questionable for some critical computations, there are better random number generators, but for most requirements it is more than enough. The most important thing actually is to know what you are doing.
With different constants and seeds we will have different permutations of the numbers in the range [0,m).
Now the next problem is how to pick good constants.

If the size of the array is known in compilation time and only one way of traversing is needed you can simply chose the numbers doing some tests.
Now, if you don’t have the size of the array in compilation time and/or you need different ways of traversing the array it gets a bit more complicated.

Take into account that you can have an m bigger than N (the array size), when X >= N you can just discard this value in a way similar to:

do { X = LCG(X,a,c,m); } while( X >= N );

We can have a table of prime numbers, but figuring out which numbers to pick to either satisfy the coprime property with c and to get close to N will be a challenge as well …

Other approach would be to pick m to be a power of two:
– We can be sure to satisfy the coprime condition just making c odd.
– We can reduce the modulus operation to a bit mask operation gaining constant speed.
a can be calculated with a = ( ( s*4 )+1) & (m-1) where s is just a random number bigger than 0 and a-1 is multiple of 4 and divisible by all prime factors of m (which is only the number two).

There are two disadvantages about using a power of two as m .
– First, we can end up with up to the double less one iterations to calculate all the X; N can be far away from m
– The quality of the random numbers generated is less because of the power of two modulus, the low bits of X will vary in a smaller cycle than m (this can be compensated with the discard behavior).

With this in mind you can create a templated forward iterator to just traverse any container which accepts random access iterators.

See you,
- Juan Pablo

Leave a Comment