Question

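I'm playing around with the C API for Python, but some corner cases are hard to understand. I could test them, but that seems error-prone and time-consuming, so I came here to see whether somebody has already done this.

The question is: what is the correct way to manage multiple threads with sub-interpreters when there is no direct relation between threads and sub-interpreters?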

Py_Initialize();
PyEval_InitThreads(); /* <-- needed? */
_main = PyEval_SaveThread(); /* <-- acquire lock? does it matter? */
/* maybe do I not need it? */
i1 = Py_NewInterpreter();
i2 = Py_NewInterpreter();

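Do I use a mutex? Are locks required? The thread functions should look something like the following (the threads are non-Python, probably POSIX threads):

Thread1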

_save = PyThreadState_Swap(i1);
  // python work 
PyThreadState_Restore(_save);

Thread2 (almost identical)

_save = PyThreadState_Swap(i1);
  // python work 
PyThreadState_Restore(_save);

Thread3 (almost identical, but with the sub-interpreter i2)

_save = PyThreadState_Swap(i2);
  // python work 
PyThreadState_Restore(_save);

Is this correct? Is this the general pattern for what I want to achieve? Are there race conditions?

Thanks!

Answer 1

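Sub-interpreters in Python are not well documented, or even well supported. The following is to the best of my understanding. It seems to work well in practice.

There are two important concepts to understand when dealing with threads and sub-interpreters in Python. First, the Python interpreter is not really multi-threaded. It has a Global Interpreter Lock (GIL) that must be acquired to perform almost any Python operation (there are a few rare exceptions to this rule).

Second, every combination of thread and sub-interpreter must have its own thread state. The interpreter creates a thread state for every thread it manages, but if you want to use Python from a thread that was not created by that interpreter, you need to create a new thread state.

First, you need to create the sub-interpreters:

Initialize Python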

Py_Initialize();

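Initialize Python thread support

Required if you plan to call Python from multiple threads. This call also acquires the GIL. (Since Python 3.7, Py_Initialize() makes this call itself; PyEval_InitThreads() was deprecated in 3.9 and removed in 3.13.)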

PyEval_InitThreads();

Save the current thread state

I could have used PyEval_SaveThread(), but one of its side effects is releasing the GIL, which would then need to be reacquired.

PyThreadState* _main = PyThreadState_Get();

Create the sub-interpreters

PyThreadState* ts1 = Py_NewInterpreter();
PyThreadState* ts2 = Py_NewInterpreter();

Restore the main interpreter thread state

PyThreadState_Swap(_main);

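We now have two thread states for the sub-interpreters. These thread states are only valid in the thread where they were created. Every thread that wants to use one of the sub-interpreters needs to create a thread state for that combination of thread and interpreter.

Using a sub-interpreter from a new thread

Here is example code for using a sub-interpreter from a new thread that was not created by the sub-interpreter. The new thread must acquire the GIL, create a new thread state for the combination of thread and interpreter, and make it the current thread state. At the end, the reverse must be done to clean up.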

void do_stuff_in_thread(PyInterpreterState* interp)
{
    // acquire the GIL (PyEval_AcquireLock/PyEval_ReleaseLock are deprecated
    // since Python 3.2 and were removed in 3.13)
    PyEval_AcquireLock();

    // create a new thread state for the sub interpreter interp
    PyThreadState* ts = PyThreadState_New(interp);

    // make ts the current thread state
    PyThreadState_Swap(ts);

    // at this point:
    // 1. You have the GIL
    // 2. You have the right thread state - a new thread state (this thread was not created by python) in the context of interp

    // PYTHON WORK HERE

    // release ts
    PyThreadState_Swap(NULL);

    // clear and delete ts
    PyThreadState_Clear(ts);
    PyThreadState_Delete(ts);

    // release the GIL
    PyEval_ReleaseLock(); 
}

Each thread can now do the following:

Thread1

do_stuff_in_thread(ts1->interp);

Thread2

do_stuff_in_thread(ts1->interp);

Thread3

do_stuff_in_thread(ts2->interp);

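Calling Py_Finalize() destroys all sub-interpreters. Alternatively, they can be destroyed manually. This must be done in the main thread, using the thread states that were created along with the sub-interpreters. At the end, make the main interpreter thread state current again.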

// make ts1 the current thread state
PyThreadState_Swap(ts1);
// destroy the interpreter
Py_EndInterpreter(ts1);

// make ts2 the current thread state
PyThreadState_Swap(ts2);
// destroy the interpreter
Py_EndInterpreter(ts2);

// restore the main interpreter thread state
PyThreadState_Swap(_main);

I hope this makes things a bit clearer.

I have a small complete example written in C++ on GitHub.

Answer 2

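I just want to point out a problem in @sterin's answer, in the part "Using a sub interpreter from a new thread (post Python 3.3)":

PyThreadState_New must be called while holding the GIL.
PyEval_RestoreThread acquires the GIL, so it must not be called while the GIL is already held; otherwise it deadlocks.

So in this case you need to use PyThreadState_Swap instead of PyEval_RestoreThread.

You can also verify which interpreter is in use: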

int64_t interp_id = PyInterpreterState_GetID(interp);

Python multi-thread multi-interpreter C API
