Skip to content

Race condition in itertools.islice under free-threading #151409

@KowalskiThomas

Description

@KowalskiThomas

Bug report

Bug description:

islice_next reads and writes three fields -- lz->cnt, lz->next, and lz->it -- and do operations on them without a critical section.

static PyObject *
islice_next(PyObject *op)
{
isliceobject *lz = isliceobject_CAST(op);
PyObject *item;
PyObject *it = lz->it;
Py_ssize_t stop = lz->stop;
Py_ssize_t oldnext;
PyObject *(*iternext)(PyObject *);
if (it == NULL)
return NULL;
iternext = *Py_TYPE(it)->tp_iternext;
while (lz->cnt < lz->next) {
item = iternext(it);
if (item == NULL)
goto empty;
Py_DECREF(item);
lz->cnt++;
}
if (stop != -1 && lz->cnt >= stop)
goto empty;
item = iternext(it);
if (item == NULL)
goto empty;
lz->cnt++;
oldnext = lz->next;
/* The (size_t) cast below avoids the danger of undefined
behaviour from signed integer overflow. */
lz->next += (size_t)lz->step;
if (lz->next < oldnext || (stop != -1 && lz->next > stop))
lz->next = stop;
return item;

Two threads calling next on the same islice object concurrently can race:

  • Both read lz->cnt = N < lz->next, both execute the skip-loop body for the same slot, advancing the underlying iterator twice for one step. In that case, the step arithmetic gets corrupted.
  • Both read lz->cnt = N < stop, both pass the stop check, both call iternext(it) and return an item -- the total number of yielded items exceeds stop.
  • There's a race on lz->cnt++ and lz->next += step: both threads can read the same value, then both can increment locally, and both store the same result. After that, the counter is permanently incorrect (such that subsequent skip/stop decisions use a wrong baseline).
  • One of the threads reaches exhaustion (goto empty) and calls Py_CLEAR(lz->it), which sets lz->it = NULL and DECREF's the iterator, which frees it (since each thread has ready it = lz->it. and not INCREF'd it). Another thread can have read lz->it before the clear and be in the middle of iternext(it) on the now-freed object in which case there is a use-after-free.

chain_next for example wraps its body in Py_BEGIN_CRITICAL_SECTION(op), which I believe is what islice should do as well.

static PyObject *
chain_next(PyObject *op)
{
PyObject *result;
Py_BEGIN_CRITICAL_SECTION(op);
result = chain_next_lock_held(op);
Py_END_CRITICAL_SECTION()
return result;
}

Reproducer

import itertools
import threading

STOP = 100
NTHREADS = 8

data = iter(range(STOP + NTHREADS * 2))
sl = itertools.islice(data, STOP)

results: list[int] = []
lock = threading.Lock()

def consume() -> None:
    while True:
        v = next(sl, None)
        if v is None:
            break
        with lock:
            results.append(v)

threads = [threading.Thread(target=consume) for _ in range(NTHREADS)]
for t in threads: t.start()
for t in threads: t.join()

The reproducer, on a free-threaded build, gets reports like this one.

WARNING: ThreadSanitizer: data race (pid=49886)
  Read of size 8 at 0x00030275f8b0 by thread T2:
    #0 islice_next itertoolsmodule.c:1663 (python.exe:arm64+0x10042bce4)
    #1 builtin_next bltinmodule.c:1770 (python.exe:arm64+0x10028a900)
    #2 cfunction_vectorcall_FASTCALL methodobject.c:449 (python.exe:arm64+0x1001379dc)
    #3 _PyObject_VectorcallTstate pycore_call.h:144 (python.exe:arm64+0x100090e80)
    #4 PyObject_Vectorcall call.c:327 (python.exe:arm64+0x100090e80)
...

CPython versions tested on:

CPython main branch

Operating systems tested on:

macOS

Linked PRs

Metadata

Metadata

Assignees

No one assigned
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions