How can you check if the number is in between so fast?

Asked

Viewed 471 times

10

It is known that with the module timeit it is possible to measure, in Python, the execution time of code snippets. Curious, I was testing what is the time it takes to verify if a certain number is in an interval defined by range, just as x in range(A, B).

For the test, I did:

$ python -m timeit -s '1000000000 in range(9999999999)'

That is, I defined the interval [0, 9999999999[, which is gigantic, and I checked if the value 1000000000 belongs to it. The result obtained was:

100000000 loops, best of 3: 0.00574 usec per loop

Which, in addition to resulting in a rather small time, resembles, for example, the time of checking with a much smaller interval:

$ python -m timeit -s '1 in range(10)'
100000000 loops, best of 3: 0.00577 usec per loop

How is that possible?

1 answer

13


In fact, considering only the object of type range, the time to check whether a given value belongs to the range is, in the rating big-O, O(1), which means that it will be constant regardless of the size of the space verified, which explains the times of both tests being equal. And the explanation for this is very simple: mathematically it’s very easy to see if a number belongs to an interval.

TL;DR

The most basic solution I can imagine to check if a number is in a certain range is to go through all the elements of this range, comparing one to one with the value I am checking and, if I find an equal, return true, otherwise false.

def pertence(sequencia, valor):
    for item in sequencia:
        if item == valor:
            return True
    return False

Certainly, such a solution would never be O(1), in fact O(n), which could justify small times for small intervals, but would not justify the same for large intervals. So how is it possible?

Mathematically, we can say that the number x belongs to the range [A, B[ if, and only if, x is greater than or equal to A and x is less than B. That is to say:

inserir a descrição da imagem aqui

All this magic happens in the object range thanks to the implementation of the __contains__, which is implicitly invoked through the operator in. And, in addition to checking whether the value of x is greater than or equal to A and minor to B, when defined the interval step, it will also be necessary to check if the value matches the step. For example, if I define the object range(0, 10, 2), I will be taking the even values from 0 to 10 and, this way, the number 3 will not belong to the range, even if it is greater than 0 and less than 10. Thus, the Python implementation would be similar to:

class range:
    def __contains__(self, x):
        if not self.A <= x < self.B:
            return False
        if self.passo is not None:
            return (x - self.A) % self.passo == 0
        return True

Thus, the verification no longer depends on the size of the interval, but on a constant number of operations, which characterizes an O solution(1).

inserir a descrição da imagem aqui

See working on Repl.it

However, the class range python belongs to the module builtins, which is implemented in C, not Python. That is, although the logic is the same, the code that is executed, in fact, to verify if the value belongs to the range is another. As a matter of curiosity, it is possible to access the official repository of Cpython and verify the implementation in C, but specifically between lines 337 and 379:

static int
range_contains_long(rangeobject *r, PyObject *ob)
{
    int cmp1, cmp2, cmp3;
    PyObject *tmp1 = NULL;
    PyObject *tmp2 = NULL;
    int result = -1;

    /* Check if the value can possibly be in the range. */

    cmp1 = PyObject_RichCompareBool(r->step, _PyLong_Zero, Py_GT);
    if (cmp1 == -1)
        goto end;
    if (cmp1 == 1) { /* positive steps: start <= ob < stop */
        cmp2 = PyObject_RichCompareBool(r->start, ob, Py_LE);
        cmp3 = PyObject_RichCompareBool(ob, r->stop, Py_LT);
    }
    else { /* negative steps: stop < ob <= start */
        cmp2 = PyObject_RichCompareBool(ob, r->start, Py_LE);
        cmp3 = PyObject_RichCompareBool(r->stop, ob, Py_LT);
    }

    if (cmp2 == -1 || cmp3 == -1) /* TypeError */
        goto end;
    if (cmp2 == 0 || cmp3 == 0) { /* ob outside of range */
        result = 0;
        goto end;
    }

    /* Check that the stride does not invalidate ob's membership. */
    tmp1 = PyNumber_Subtract(ob, r->start);
    if (tmp1 == NULL)
        goto end;
    tmp2 = PyNumber_Remainder(tmp1, r->step);
    if (tmp2 == NULL)
        goto end;
    /* result = ((int(ob) - start) % step) == 0 */
    result = PyObject_RichCompareBool(tmp2, _PyLong_Zero, Py_EQ);
  end:
    Py_XDECREF(tmp1);
    Py_XDECREF(tmp2);
    return result;
}

The parameter r represents the object itself range and the parameter ob represents the value to be verified. The object r owns the fields start, stop and step that represent the beginning, end and step of the interval. Although it is a larger implementation, the logic is exactly the one presented previously in Python.

In the C implementation, it is also possible to verify that the object r is not modified at any time, which implies that the generator that is defined by range will not be modified by checking if the value belongs to the range. Being a generator and, by definition, generator can be iterated only once, it would be catastrophic if the operator in travel the range, because the interval would no longer be useful after that. Note that we are dealing here only with the object range and its peculiarities, which does not imply to be so with all Python generators.

intervalo = range(10)

if 5 in intervalo:
    for i in intervalo:
        print(i)

See working on Repl.it

  • I don’t understand the math operation (x - self.A) % self.passo == 0 in the case of the variable passo contain value. What would be?

  • 1

    @cat for example, range(10, 20, 2), the step value would be 2, so to check if 15 belongs, it would be (15 - 10) %2 == 0, false, while for 16 would be (16 - 10) %2 == 0, true.

Browser other questions tagged

You are not signed in. Login or sign up in order to post.