This form of random number generator was pointed out in sci.crypt on the 6th October 2002 by Mok-Kong Shen, drawing our attention to the paper "Uniformly distributed sequences of p-adic integer, II", by Vladimir Anashin. This is available from http://www.arXiv.org/: select math and 2002, and then search for Anashin.
One way to generate random numbers is to repeatedly apply a fixed function f(), producing the sequence of random numbers xn+1 = f(xn). A simple example of this is f(x) = x + 1, producing the sequence 0, 1, 2, 3, ... That may not be very random, but at least it exhausts all possible values before repeating (assuming that x is held as an n-bit integer).
Many other choices for f divide the space of possible x into a large number of short cycles, into which other sequences coalesce. An especially bad choice might be f(x) = x2: f(0) = 0, f(1) = 1. If we start from 2 we get 2, 4, 16, 256, 65536, and then (with a 32-bit word) coalesce to 0. In Vol II of "The Art of Computer Programming", Knuth deals in great detail with random number generators. The answer given to Exercise 3.1.1(12) shows that for randomly chosen f(), you tend to end up with cycles of length about the square root of the number of distinct possible values of x: so with a 32-bit computer word, we could cycle after seeing only a few times 65,536 different values.
The situation is a little brighter if f(x) is a bijective: that is if it permutes the set of possible x amongst themselves. In this case no coalescense is possible, and we are interested in the cycle structure of a random permutation. Knuth is again at our service: In Vol I of "The Art of Computer Programming", Exercise 1.3.3(17) shows that if we pick a random permutation and a random starting point, then each possible length for the resulting cycle is equally probable, so the average cycle length is (n+1)/2, where there are n different possible values of x. If, on the other hand, you pick a random permutation, the probability that it consists of just a single cycle is 1/n, because there are n! permutations and (n-1)! ways of writing down a cycle starting at a pre-designated element (such as the smallest element in the set), or because the cases where the permuation consists of a single cycle are precisely those cases where picking any point leads to a cycle of length n.
Anashin showed that for a wide range of functions g(), the function
f(x) = 1 + x + 2(g(x + 1) - g(x))
provides a single cycle of maximum length. If we do our calculations
mod 2n, as is natural with an n-bit binary computer word,
then his list of possible g includes all those built up by composing
functions arbitrarily from the following list:
Note that all these operations share a common feature: the low k bits of the output depend only on the low k bits of the input, and not on any higher bits.
This is least obvious in exponentiation. Note that this is defined as (1+2y)z, so that a unique result exists even for -ve z. For z ≥ 0 we can work out the low k bits of the result given only the low k bits of y because we just have addition followed by repeated multiplication. To find (1+2y)-1 we could try all possible values of x until finding x such that (1+2y)x = 1. There is only one possible such value. But there is also only one possible k-bit x' such that (1+2y)x' = 1 mod 2k: for all other k-bit x' the result is not 1 mod 2k. Since we already know that for multiplication the low order k bits of the result depend only on the low order k bits of the input, the low order k bits of x and x' must be identical, and we can predict the low order k bits of (1+2y)-1 from the low order k bits of y.
So we can work out the low order k bits of (1+2y)z from the low order k bits of y. (1+2y)2 = 1 + 4y + 4y2 = 1 + 4y(y+1) = 1 + 8b for some integer b, since either y or (y+1) must be divisible by 2. Since x2c+d = (x2)cxd this means that the low order 3 bits of (1+2y)z do not depend on the bits of z above the low order bit. If (1+2y)2^k = 1 + 2db for some integer b, then (1+2y)2^(k+1) = ((1+2y)2^k)2 = 1 + 2d+1b + 22dd2, so we can show by induction that the contribution from the 2ks bit of z is 1 mod 2k and so the low order k bits of (1+2y)z depend only on the low order k bits of z, (and y, as we showed in the previous paragraph).
Anashin's paper makes use of mathematics beyond me (such as the p-adic numbers): in fact I cannot even guarantee that I have stated a restricted version of his theorem correctly. However, I find this result so interesting that I have attempted to provide an elementary proof of what may, for all I know, be only a related result. Here it is:
Let g(x) be a function on k-bit integers, with the property that the low l bits of the output depend only on the low l bits of the input for all l ≤ k.
Let f(x) = 1 + x + 2 (g(x + 1) - g(x)) mod 2^k
f(x) also has the property that the low order l bits of its output depend only on the low order l bits of its input. In particular, note that the low order bit of the sequence produced from f(x) alternates 0,1,0,1... I will prove that f(x) generates a full cycle of possible values.
Suppose that the low order l bits of the sequence produced from xn+1 = f(xn) form a cycle of length 2^l. We have just seen that this is the case for l=1. Consider the low order l+1 bits of the sequence produced from f(x). Because of the cycle in the low order l bits, this must either form a single cycle of length 2l+1, or two cycles of length 2l. Suppose that we have two cycles of length 2l. Consider any one of these cycles - for example, the one starting with x0 = 0. If instead of looking at x0, x1, .. x2^l-1 we look at f(x0), f(x1),.. f(x2^l-1) we have the same numbers, just in a different order, because of the cycle at 2l. So the sums of the two sets of 2l numbers must be the same. If we write them down in two columns we are comparing xi with 1 + xi + 2(g(xi + 1) - g(xi)). If we subtract one column from the other we are left with 2l + SUMi (2 * g(xi + 1) - 2 * g(xi)). Here we are dealing with numbers mod 2l+1. Because of the multiplication by 2, we can discard the top bit of the output of g(xi + 1) and g(xi). But this means that these contributions depend only on the low order l bits of xi and xi + 1. We have a cycle of length 2l in the low order l bits, so we see every possible combination of l bits in the values of both xi and (xi + 1). This means that every output seen from g(xi + 1) is also seen from g(xi), and the contributions from these two terms cancel out. So the result of subtracting our two columns from each other is 2l. But this means that the sums of the two sets of columns are not the same and we have a contradiction, so we can only have a single cycle of length 2l + 1 in the low order l+1 bits. So by induction, the sequence produced by xn + 1 = f(xn) is of full cycle.
0 1 + 0 + 2(1 - 0) = 3 3 1 + 3 + 2(0 - 1) = 2 2 1 + 2 + 2(1 - 4) = 5 5 1 + 5 + 2(4 - 1) = 4Clearly this does not cycle at 22 because we go 0, 3, 2, 5, 4 - and you can also see the difference between the two columns adding up to 4 != 0 mod 8.
Most of these comments can be illustrated by considering particular g(x).
Many people pointed out that this result, on its own, does not provide any guarantee of the randomness of the resulting sequence, let alone its security. For example, if g(x) = 0 for all x, we have the sequence 0,1,2,3... My interest is in the simplicity of the rule, rather than its possible practical applications.
Bob Harris pointed out that there will be full-cycle generators not of this form. If a permuation is available, it can be used to generate a full cycle sequence as xi = f(i) (or xi+1=f(f-1(xi) + 1). All possible full cycles can clearly be generated in this way, including those whose low bits do not cycle with smaller period. He also pointed out that by using a general quadratic g() in Anashin's construction you can retrieve all full-cycle sequences given by the criteria listed in Knuth for an LCG mod 2n to generate the full cycle (Knuth Vol II Theorem A section 3.2.1.2).
On the 6th Dec 2002 Mok-Kong Shen posted some comments passed to him
in an email by Anashin to sci.crypt (and sci.crypt.random-numbers).
He pointed out that you can apply a function to the sequence to yield
a further sequence in which every bit position has the maximum possible
cycle length, although it is still not clear what the cryptographic
properties are. The suggested function is
F(x)=(1+ I(x))XOR(2h(I(x)AND(-2)))(mod 2^k)
where I(x) is the bitwise reversed version of x (so for a 3-bit word
8=100 becomes 1=001), and h() is any function obeying the same rules
as g, so that the low k bits of h(x) depend only on the low k bits
of x.
To see this, consider first of all the term (1+ I(x)). The sequence of x values is of full length and so contains all 2n possible n-bit words. Since the low n-1 bits cycle with length 2n-1 we know that the sequence of x values contains a stream of 2n-1 n-bit words followed by a repeat of itself, except that the top bit of each word has been flipped. Therefore this sequence contains the words 1111..11 and 0111..11 every 2n-1 words. After modification according to (1+ I(x)) these become 0000..00 and 1111..11 - again every 2n-1 words. So no bit position can cycle with length 2n-1. Since we know that everything bit cycles with length 2n, any shorter cycle must be a divisor of this, but it would then also be a divisor of 2n-1 so every bit position has full cycle.
The remaining term is 2h(I(x) AND (-2)). But -2 is 1111..10, so this knocks out the low bit of I(x), and the contribution is the same for both x=1111..11 and 0111..11, so our result still has full cycle on every bit. Also, the low k bits of 2h(I(x) AND (-2)) depend only on the low k-1 bits of I(x) (including the bit we know is masked out) so our F(x) is again a bijection: we can work out the low bit of x from the low bit of F(x), and then the low 2 bits of x from the low bit of x and the low 2 bits of F(x), and so on. So the output sequence consists of every possible n-bit word.