115

Could someone explain if and why the following random number function from XKCD works?

I have never quite been able to understand it. I ask the question on SO hoping to find a more "human" explanation that could be understood by me.

RFC 1149.5 specifies 4 as the standard IEEE-vetted random number.

456

Tour of Accounting

139

I am not able to rightly apprehend the kind of confusion of ideas that could provoke such a question.

-- Charles Babbage

95

The comment implies that who ever wrote it just rolled dice to get the number and is having it return the results of a dice roll. While the results of the dice roll are going to be truly random as opposed to pseudorandom, it misses the spirit of what a random number function is.

Thus, it is a joke.

68

I think you have to be really pedantic on this one, part of its flavor.

The function name "getRandomNumber()" basically means "Get a number that is random". Here, the number "4" was arrived at randomly, through a dice roll. Thus, 4 IS a random number. This function works PERFECTLY. It does exactly what is asked for, a random number.

This is a classic case of "exactly what you asked for, but not what you wanted".

60

From the outside of any given random number function, it's impossible to tell whether or not the numbers coming out of it are in fact random or not.

Seeing a stream of 4s come out would indicate that it's possibly not random, but alternatively, it could be a good time to get a lotto ticket (or a bad one, you can't tell).

Looking at the function's internals you can tell that it's not using a random source to produce those values, but without seeing that you'd never know for sure.

This is because it's perfectly legitimate for a random source to repeat. Lots of people have the funny concept the following are not random:

  • 4 4 4 4 4 4 4 4
  • 1 2 3 4 5 6 7 8
  • 8 7 6 5 4 3 2 1
  • 1 3 5 7 9 11 13

Sure, it's statistically unlikely that a random number generator will emit those patterns, but if it does, it's not a sign of not being random.

At the basic level, all sequences can be represented with bit patterns, 1s and 0s, and there are only so many combinations of that.

0        1

now the following sequences of length 2 are fully legitimate random sequences

00
01 
10 
11 

And as a result, so is

00000000000000000000
11111111111111111111

and

10101010101010101010

A random number generator can easily produce these sequences without failing to be random. Sure, the longer the run is of bits it emits the more likely they are to change, and the longer a sequence is the less likely it is to be emitted by a random number generator, but that won't stop it from generating it.

I'll put it like this:

There are billions of people on the earth.

If you go out today and buy a lotto ticket, the chance of you winning is heavily against you.

But somebody wins, every day. Some people win more than once.

59

See "What Colour are your bits?". Randomness is not a property of a number, but of the process that produces the number.

14

Another example: The iPod had a problem with its shuffle feature, because it "wasn't random enough"; specifically, it would play the same artist several times in a row, or play a song more often, and so on. A lot of users complained about this, commenting that their iPod had a 'preference'. Maybe it liked Creedence (who wouldn't!) or refused to play Michael Jackson.

In reality, the problem was that it was TOO random, or at least it was as random as it could be (pseudo-random). Apple eventually added a slider to determine how 'random' you wanted songs to be. More 'random' actually meant less random, as it tracked which songs it had played recently and avoided similar songs.

Especially with songs, there are so many connections with other songs (same artist, same genre, same year, same album) that it's easy to see patterns that aren't actually there. The problem with randomness is that given a small enough set of unconnected possibilities, it won't seem random at all.

13

The programmer misunderstood the requirement 'return a random number'.

While the number was random at design-time it has been hard-coded and so it will always return the same number. The requirement most certainly meant that the function should return a different number each time at runtime.

Basically, the programmer screwed up in a funny way.

I guess you could clarify the requirement 'return a random number' with 'at runtime' and maybe the programmer will get the function right. I doubt it though.

I don't think "it works".

8

Snippet seen in a project I previously worked:

/**
 * Return a random number between 1 to 5
 */
private int getRandom() {
    int rand = Random.nextInt(6); //number between 0 to 5 (inclusive)
    if (rand == 0)
        rand++;
    return rand;    
}

Unfortunately, it wasn't a lottery app else I'd be a millionaire by now.

5

It works perfectly. With the caveat that every time the function is called, the programmer needs to roll the dice again, rewrite the function, and recompile the program. I assume that the number of dice and/or the probability distribution of the function are documented somewhere. Or not.

3

Technically, this function generates a random number sequence of just one number. More useful functions would generate longer sequences. Even more useful functions would generate different sequences when called multiple times. However, it could be argued that this function produces a "better" random number than more useful functions since, if the comment is to be believed, the sequence was generated by a universally acknowledged randomization procedure: a die roll.

Most "random number" generators are actually deterministic algorithms that are difficult to predict if you don't have access to some set of inputs. Traditionally, the needed input is called a seed and is a number a program provides to a random number generator. In turn, the seed is often generated by using the computer's internal clock so that each instance of the program will provide different results. But the sequences generated, while difficult to predict, are not random in the sense of non-deterministic. In fact, the deterministic nature of these sequences is the basis for many encryption techniques.

A truly random sequence would be both difficult to predict and non-deterministic (and meet certain statistical tests). Die rolls (as in the cartoon) are a good source for random sequences. But for most applications, the cost of generating a truly random sequence far outweighs the benefits compared to pseudo-random sequences.

2

You can use atmospheric noise like these guys to generate true randomness.

0

Thanks for re-opening the question ;)

I don't know why Preets thinks the function "works", and what is meant by "working".

A Random Number Generator should generate ...uhm... random numbers. Throw of dice can be seen as random number generation, as the resulting SEQUENCE of numbers appears to be random.

However, as has been pointed out by others, a single NUMBER cannot be "random".

The joke is that you take a random number generator, take one value of it, and claim that repeating that same result is still a random generator.

The function in the cartoon "generates" the same number over and over, is therefore a predictable sequence, and thus not random.

0

the algorithm is perfectly correct if you use it once and the user did not observe the source. if you want to use it another time (to return a random number over N), the XKCD recipe say:

  • build N different getRandomNumber() functions with the N different numbers,
  • delete N-1 programs randomly.

you have to be a lazy programmer to do something shorter and an over-zealous user to see the difference.

-5

This is misleading. This would be the right implementation.

int getRandomNumber(){

  getHumantoRolldice();
  return captureOutputofRolldiceEvent();

}