Consider the following straightforward Python extension. When start()-ed, Foo simply appends the next sequential integer to a py::list, once per second:
#include <boost/python.hpp>
#include <atomic>
#include <chrono>
#include <iostream>
#include <thread>

namespace py = boost::python;

struct Foo {
    Foo() : running(false) { }
    ~Foo() { stop(); }

    void start() {
        running = true;
        thread = std::thread([this]{
            while(running) {
                std::cout << py::len(messages) << std::endl;
                messages.append(py::len(messages));
                std::this_thread::sleep_for(std::chrono::seconds(1));
            }
        });
    }

    void stop() {
        if (running) {
            running = false;
            thread.join();
        }
    }

    std::thread thread;
    py::list messages;
    std::atomic<bool> running;
};

BOOST_PYTHON_MODULE(Foo)
{
    PyEval_InitThreads();
    py::class_<Foo, boost::noncopyable>("Foo",
        py::init<>())
        .def("start", &Foo::start)
        .def("stop", &Foo::stop)
    ;
}
Given the above, the following simple Python session always segfaults, without ever printing anything:
>>> import Foo
>>> f = Foo.Foo()
>>> f.start()
>>> Segmentation fault (core dumped)
The core dump points to:
namespace boost { namespace python {

    inline ssize_t len(object const& obj)
    {
        ssize_t result = PyObject_Length(obj.ptr());
        if (PyErr_Occurred()) throw_error_already_set(); // <==
        return result;
    }

}} // namespace boost::python
Where:
(gdb) inspect obj
$1 = (const boost::python::api::object &) @0x62d368: {<boost::python::api::object_base> = {<boost::python::api::object_operators<boost::python::api::object>> = {<boost::python::def_visitor<boost::python::api::object>> = {<No data fields>}, <No data fields>}, m_ptr = []}, <No data fields>}
(gdb) inspect obj.ptr()
$2 = []
(gdb) inspect result
$3 = 0
Why does this fail when run in a thread? obj looks fine, and result is set correctly. Why does PyErr_Occurred() report an error? Who sets it?
Answer 0 (score: 10)
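In short, there is a mutex around the CPython interpreter known as the Global Interpreter Lock (GIL). This mutex prevents parallel operations on Python objects. At any point in time, at most one thread, the one that has acquired the GIL, is allowed to operate on Python objects. When multiple threads are present, invoking Python code without holding the GIL results in undefined behavior.
C or C++ threads are sometimes referred to as alien threads in the Python documentation. The Python interpreter has no ability to control an alien thread. Alien threads are therefore responsible for managing the GIL themselves so that they can run concurrently or in parallel with Python threads. With this in mind, let's examine the original code: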
while (running) {
  std::cout << py::len(messages) << std::endl;           // Python
  messages.append(py::len(messages));                    // Python
  std::this_thread::sleep_for(std::chrono::seconds(1));  // No Python
}
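As noted above, only two of the three lines in the thread body need to run while the thread owns the GIL. One common way to handle this is to use an RAII class to manage the GIL. For example, with the following gil_lock class, the calling thread acquires the GIL when a gil_lock object is created and releases it when the gil_lock object is destroyed.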
/// @brief RAII class used to lock and unlock the GIL.
class gil_lock
{
public:
  gil_lock()  { state_ = PyGILState_Ensure(); }
  ~gil_lock() { PyGILState_Release(state_);   }
private:
  PyGILState_STATE state_;
};
The thread body can then use an explicit scope to control the lifetime of the lock.
while (running) {
  // Acquire GIL while invoking Python code.
  {
    gil_lock lock;
    std::cout << py::len(messages) << std::endl;
    messages.append(py::len(messages));
  }
  // Release GIL, allowing other threads to run Python code while
  // this thread sleeps.
  std::this_thread::sleep_for(std::chrono::seconds(1));
}
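The scoping pattern above can be exercised without Python. The following is only a minimal sketch: it uses a std::mutex named fake_gil as a hypothetical stand-in for the GIL, and fake_gil, fake_gil_lock, and run_worker are illustrative names that belong to neither Python nor Boost.Python. It shows the lock being held while the shared container is touched and released before the thread sleeps.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the GIL: a single process-wide mutex.
std::mutex fake_gil;

// Same shape as gil_lock: acquire on construction, release on destruction.
class fake_gil_lock
{
public:
  fake_gil_lock()  { fake_gil.lock();   }
  ~fake_gil_lock() { fake_gil.unlock(); }
  fake_gil_lock(const fake_gil_lock&) = delete;
  fake_gil_lock& operator=(const fake_gil_lock&) = delete;
};

// Appends `count` consecutive integers from a worker thread. The lock is
// held only while the shared vector is modified, never while sleeping.
std::vector<std::size_t> run_worker(std::size_t count)
{
  std::vector<std::size_t> messages;
  std::thread worker([&messages, count]
  {
    for (std::size_t i = 0; i < count; ++i)
    {
      {
        fake_gil_lock lock;                    // Acquire.
        messages.push_back(messages.size());
      }                                        // Release before sleeping.
      std::this_thread::sleep_for(std::chrono::milliseconds(10));
    }
  });
  worker.join();
  assert(messages.size() == count);
  return messages;
}
```

Calling run_worker(3) yields a vector holding 0, 1, 2, and fake_gil is unlocked again once the function returns.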
Here is a complete example based on the original code that demonstrates the program working properly once the GIL is explicitly managed:
#include <atomic>
#include <chrono>
#include <iostream>
#include <thread>
#include <boost/python.hpp>

/// @brief RAII class used to lock and unlock the GIL.
class gil_lock
{
public:
  gil_lock()  { state_ = PyGILState_Ensure(); }
  ~gil_lock() { PyGILState_Release(state_);   }
private:
  PyGILState_STATE state_;
};

struct foo
{
  foo() : running(false) {}
  ~foo() { stop(); }

  void start()
  {
    namespace python = boost::python;
    running = true;
    thread = std::thread([this]
      {
        while (running)
        {
          {
            gil_lock lock; // Acquire GIL.
            std::cout << python::len(messages) << std::endl;
            messages.append(python::len(messages));
          } // Release GIL.
          std::this_thread::sleep_for(std::chrono::seconds(1));
        }
      });
  }

  void stop()
  {
    if (running)
    {
      running = false;
      thread.join();
    }
  }

  std::thread thread;
  boost::python::list messages;
  std::atomic<bool> running;
};

BOOST_PYTHON_MODULE(example)
{
  // Force the GIL to be created and initialized. The current caller will
  // own the GIL. This call is a no-op since Python 3.7, is deprecated
  // since 3.9, and no longer exists in 3.13; drop it on such versions.
  PyEval_InitThreads();

  namespace python = boost::python;
  python::class_<foo, boost::noncopyable>("Foo", python::init<>())
    .def("start", &foo::start)
    .def("stop", &foo::stop)
    ;
}
Interactive usage:
>>> import example
>>> import time
>>> foo = example.Foo()
>>> foo.start()
>>> time.sleep(3)
0
1
2
>>> foo.stop()
>>>
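One caveat about stop() in the example above, which the original answer does not address: Boost.Python does not release the GIL when it calls a wrapped C++ function, so thread.join() runs while the calling Python thread still holds the GIL. If the worker happens to be blocked in PyGILState_Ensure() at that moment, neither thread can make progress. A common remedy is to release the GIL around the join, for instance with PyEval_SaveThread() and PyEval_RestoreThread(). The sketch below illustrates the idea with a std::mutex stand-in rather than the real GIL; fake_gil, fake_gil_release, and stop_while_holding_lock are hypothetical names chosen for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the GIL; not a Python API.
std::mutex fake_gil;

// Inverse of a scoped lock: releases on construction and re-acquires on
// destruction. Must be created by the thread that currently owns fake_gil.
// With the real GIL, the analogous calls would be PyEval_SaveThread() and
// PyEval_RestoreThread().
class fake_gil_release
{
public:
  fake_gil_release()  { fake_gil.unlock(); }
  ~fake_gil_release() { fake_gil.lock();   }
  fake_gil_release(const fake_gil_release&) = delete;
  fake_gil_release& operator=(const fake_gil_release&) = delete;
};

// Models a Python thread that calls stop() while owning the lock.
// Returns how many times the worker managed to acquire the lock.
int stop_while_holding_lock()
{
  std::atomic<bool> running(true);
  int iterations = 0;

  fake_gil.lock(); // The caller owns the lock, like the interpreter thread.

  std::thread worker([&]
  {
    do
    {
      // Blocks until the caller gives up the lock.
      std::lock_guard<std::mutex> lock(fake_gil);
      ++iterations;
    } while (running);
  });

  running = false;
  {
    fake_gil_release release; // Without this, join() would never return.
    worker.join();
  }                           // Lock re-acquired here.

  fake_gil.unlock();
  return iterations;
}
```

Here the worker always acquires the lock exactly once after the caller releases it, so stop_while_holding_lock() returns 1; removing the fake_gil_release scope makes the join hang.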