How is computer randomization generated?

Asked

Viewed 5,796 times

174

Doubts

  1. How computer randomization is done?
  2. What algorithm or mathematical basis does the computer use to generate those numbers?

For example:

No Javascript utilizo o Math.random() it returns me different numbers every time like for example: 0.8908976025413722, 0.7090631313621998 and etc.

6 answers

176


The only really random thing are quantum effects, such as radioactive decay (which of the nuclei will break down now?). And that’s kind of tricky to get on home computers.

Several of these algorithms (in particular the rand() of C, which many languages use) are based on pseudo-random generators. They generate a sequence of numbers that looks like random to a person, but they are totally predictable if they depart from a known Seed. They are therefore very bad for security or encryption. They take a Seed (given initial value, usually from an external source, such as the start date/time of the process) and apply transformations to generate the next value. The upside is they’re fast. And the fact that they are predictable allows you to save the Seed of a generated puzzle as your id, for example, and not have to spend any space with the puzzle itself, just generate again by the same Seed.

For example: (I just invented this algorithm, it’s not good)

static double seed; // uma seed entre 0 e 1
double random() {
   seed /= 1125899839733759;  // um número primo
   seed *= 18014398241046527; // outro
   seed -= (int)seed;         // normalizar
   return seed;
}

int main() {
    seed = 0.753;
    printf("%f\n", random()); // 0.048001
    printf("%f\n", random()); // 0.768009
    printf("%f\n", random()); // 0.288139
    printf("%f\n", random()); // 0.610224
    printf("%f\n", random()); // 0.763582
}

(coliru)

One of the most commonly used algorithms for this is the Mersenne Twister, which is capable of generating 219937-1 numbers before repeating the sequence.

For any such case, it is important to be careful in choosing Seed. If an attacker gains knowledge of the Seed used by you, he can predict every move of the program. An example of this type of attack against an online poker site can be found in this article: How We Learned to Cheat at Online Poker: A Study in Software Security. The attack starts with the knowledge that the function Randomize() Pascal’s is used and that she uses as Seed the number of milliseconds since midnight. The problem is that I can kick the exact time that happened to server randomization (ie the time that poker was started). Considering you miss a minute more or less, it’s only 180,000 possible sequences. An extremely small number for a computer, just test all possible sequences and compare which generates the cards I have in hand. The moment you figure out what the sequence is, you can extrapolate and know the cards in every player’s hand. Knowledge learned: Using only the watch as the source of a Seed that should be random and hidden is a bad idea.

Other methods are based on entropy sources that can be obtained from hardware, such as keyboard usage analysis, mouse movement, packet traffic on the network, etc. Having sufficient data will be as close to random as possible. On Linux you can read from this source through pseudo-files /dev/random or /dev/urandom. They are a good source if you need real random number and can be used to initialize the Seed of a pseudo-random generator if convenient.

  • 11

    It should be remembered that many of the modern generators, even without access to an "external random generator" (electronic device of quantum events that you mentioned), have access to the clock. The clock helps to "sow" randomization without repeating, and this is even present in calculators. Query the data memory randomly (/devs cited) also helps and can mix clock data, but requires more complex interruptions and treatment.

  • 3

    Brownian movements are truly random.

  • 1

    @Peterkrauss There are several ways to get a good mix (random data with pseudo-random) without this interruption problem. According to that response in security.SE, many people think that /dev/urandom/ is problematic because "in the absence of sufficient entropy it falls on pseudo-random numbers", but in practice this should not occur never. Also, as I commented in my answer, there are other phenomena unpredictable enough that are indistinguishable from pure randomness.

  • @mgibsonbr Corretíssimo, and as 90% of uses are not scientific, and 90% of users do not demand high encryption, solved in this context. I must stick only to the simplest machines, without "memory for mixing", but with clock availability (a very cheap feature)... There is also the didactic example, of the experiment where new ("virgin") and identical machines will always have the same seed so the same random result. The clock softens the situation if there’s no timing, and something like /dev/urandom/ solves right at the first different bits between them.

  • 6

    Getting entropy from physical/quantum effects may not be so difficult. Some of the noise on an off-air TV or radio comes directly from the Big Bang. That is, You can constitute a random generator of very good quality simply by attaching a radio out of the station on the sound board :)

  • 6

    This business of the /dev/urandom being less secure is superstition https://gist.github.com/tarcieri/6347417

  • 2

    Taking advantage of the TV interference tip as a random generator through an RF circuit coupled to the computer. Using any microcontroller just build a small circuit with a wire or track of certain length coupled to one of the analog ports and read it to generate the SEED of the random number generator, at each reading of the analog port this will capitalise the electrical interference present in the environment, and may even suffer capacitive effects with the proximity of metallic objects or the human body.

Show 2 more comments

66

Classical pseudo random number generators (PRNG) work as follows (LC algorithm):

 x1 = (a . x0 + b) mod n

where x0 is the "seed", or previous random number, a and b are chosen constants, and n is the largest random number desired. Remembering that mod is the operation of the rest of the division.

This is used with integers, probably Javascript generates an integer number of 64 bits or greater and divides by a constant before returning via Math.random(), since in this function the number comes between 0 and 1.

The quality of this PRNG depends on the good choice of a and b. To see which constants and/or the exact implementation the way is to consult the source of the Javascript implementation you are interested in, if it is open source (V8/Node is, Mozilla is, others are not).

There are more modern Prngs than LC. Mersenne Twister is widely used and recommended. Since Javascript’s Math.Random() cannot be "seeded", it can make use of any PRNG, so just by consulting the source to see which one it is.

52

Randomness is related to the concept of unpredictability: given a number (or sequence of numbers) you are unable to predict what the next number will be. In addition, other requirements may be present, such as the probability of each number within the domain being drawn (which, in general, must be the same).

Computers in general are deterministic, ie: the same program executed twice with all identical entries will produce two identical outputs. For this reason, a external source is necessary to produce really random results. This is not something the computer can produce by itself.

As such a source is rarely present, several algorithms have been designed to produce sequences of numbers that are - to some degree - indistinguishable from a random sequence, from a single starting value (seed, or Seed). This starting value has to be different with each invocation of the program, but it does not necessarily have to be random, so a widely used source is the system clock. Such algorithms are called Pseudo-Random Number Generators (PRNG).

Some applications, such as cryptography, require generated sequences to be truly unpredictable without the knowledge of the external source(s) (s). A common PRNG may have its seed "guessed" after observing a finite number (and computationally speaking, not very large) of previously generated numbers. This is not a problem in domains such as simulations (as it is sufficient that the sequence seem random), but when the security of a system and/or the confidentiality of a communication depends on that randomness, a more sophisticated algorithm is needed (a CSPRNG - PRNG Cryptographically Safe).

One way to do this is to generate a sequence of pseudo-random numbers using as seed a secret key. This is done through a simple counter (zero, one, two...) where each element is encrypted or hashed with the aid of this key. This is even the way most flow ciphers (stream Cipher) work: generate a random sequence of bytes, and combine this sequence with the data using XOR.

When a secret seed is not available, then all that remains is to resort to other external sources, as mentioned earlier. Various events of the computer itself can be considered "unpredictable", such as: a) the keystrokes typed by the user or the movement of the mouse; b) the date of creation of the various files (or even their contents); c) the historical CPU usage data; etc. Not always the entropy (disorder, unpredictability) of these events is sufficient, but their combination with a PRNG can greatly increase the quality of it. As mentioned in another answer, the numbers generated by a common PRNG start repeating after a certain period. If this is combined with data from external sources (mixture, or Mixing) this period can be quite lengthy - which is sufficient for many practical applications.

Finally, if none of this is enough, only the use of hardware modules (probably involving quantum mechanics) or data from external systems remains. The site Random.org, for example, generates a sequence of random numbers based on atmospheric information (Atmospheric Noise) - which are so unpredictable with current technology that they can be considered "really random". Of course, one should not use them for confidential operations (such as generating passwords and keys) because they come from third parties, but for scientific applications or perhaps raffles in games of chance, they can be a good alternative.

46

The randomization made by Math.random() returns a positive value (greater than or equal to 0) of the type Number, but less than 1, it is randomly chosen or pseudo randomly with a uniform distribution approximately over this range, using an implementation-dependent strategy or algorithm (implementation-dependent).

Here is the implementation of it in the V8 engine used by Chrome to run Javascript:

uint32_t V8::Random() {

    // Gerador de Número aleatório utilizando o algoritmo MWC de George Marsaglia.
    static uint32_t hi = 0;
    static uint32_t lo = 0;

    // Inicialize a descendência utilizando o  system random(). Se uma das
    // descendências nunca deve tornar-se a zero novamente, ou se random() retorna zero
    // nós evitamos ficar presos com zero bits no hi ou no lo, reinicializando eles na demanda.
    if (hi == 0) hi = random();
    if (lo == 0) lo = random();

    // Misturar os bits.
    hi = 36969 * (hi & 0xFFFF) + (hi >> 16);
    lo = 18273 * (lo & 0xFFFF) + (lo >> 16);
    return (hi << 16) + (lo & 0xFFFF);
}

References:

Base(Source):

http://dl.packetstormsecurity.net/papers/general/Google_Chrome_3.0_Beta_Math.random_vulnerability.pdf

Original Stack Overflow Response (EN)

Also, here are some related questions in Stack Overflow (EN):

41

The answer is based on the concepts of "random seed" (or "Seed"), and "pseudorandom number generators".

Why "pseudo"? Because the random numbers that are appearing in your call to Math.random() are actually "almost" random, since they are based on a "seed". This seed is a "basis" for the generation of a group of numbers that seem random.

In general, it is based on the current date/time, because it is basically a number that never repeats, and because there is a strong element of randomness at small intervals (such as milliseconds).

There are also other generation techniques such as user-pressed keys on keyboard, etc.

36

One of the most commonly used algorithms in pseudo-random numbers by computers is GCL also considered to be one of the simplest is the so-called linear congruent generator (GCL), presented by D. H. Lehmer in 1949: consider integers m, a, c and s such that m > 0, 0 a < m, 0 a < m and 0 s < m. Putting x0 = s, set recursively for each integer n 0:


xn + 1 = (a xn + c) mod m

That is, Xn + 1 is the remainder of the division of a Xn + c by m. For example, if m = 10, a = 11, c = 3, and x0 = s = 3, then a x0 + c = 36. As the rest of the division of 36 by 10 is 6, it follows that X1 = 6. Thus the X1 + c = 69. As the rest of the division of 69 by 10 is 9, it follows that x2 = 9. Proceeding with this scheme, starting from the number x0 = 3, we generate the numbers X1 = 6, x2 = 9, X3 = 2, X4 = 5, X5 = 8, X6 = 1, X7 = 4, X8 = 8, X9 = 0, X10 = 3. Note that after X10, the sequence of numbers generated is repeated (with period 10).


response based on: http://en.wikipedia.org/wiki/Random_number_generation

Browser other questions tagged

You are not signed in. Login or sign up in order to post.