Question

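Is there a way to profile a Python process's usage of the GIL? Basically, I want to find out what percentage of the time the GIL is held. The process is single-threaded.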

My motivation is that I have some code written in Cython, which uses nogil. Ideally, I would like to run it in a multi-threaded process, but to know whether that could be a good idea, I need to know whether the GIL is free for a significant amount of the time.

I found this related question from 8 years ago. The only answer there is "No". Hopefully things have changed since then.

Answer 1

Completely by accident, I found a tool that does just this: gil_load.

It was actually released after I posted this question.

Well done, @chrisjbillington.

>>> import sys, math
>>> import gil_load
>>> gil_load.init()
>>> gil_load.start(output = sys.stdout)
>>> for x in range(1, 1000000000):
...     y = math.log(x**math.pi)
[2017-03-15 08:52:26] GIL load: 0.98 (0.98, 0.98, 0.98)
[2017-03-15 08:52:32] GIL load: 0.99 (0.99, 0.99, 0.99)
[2017-03-15 08:52:37] GIL load: 0.99 (0.99, 0.99, 0.99)
[2017-03-15 08:52:43] GIL load: 0.99 (0.99, 0.99, 0.99)
[2017-03-15 08:52:48] GIL load: 1.00 (1.00, 1.00, 1.00)
[2017-03-15 08:52:52] GIL load: 1.00 (1.00, 1.00, 1.00)
<...>

>>> import sys, math
>>> import gil_load
>>> gil_load.init()
>>> gil_load.start(output = sys.stdout)
>>> for x in range(1, 1000000000):
...     with open('/dev/null', 'a') as f:
...         print(math.log(x**math.pi), file=f)
[2017-03-15 08:53:59] GIL load: 0.76 (0.76, 0.76, 0.76)
[2017-03-15 08:54:03] GIL load: 0.77 (0.77, 0.77, 0.77)
[2017-03-15 08:54:09] GIL load: 0.78 (0.78, 0.78, 0.78)
[2017-03-15 08:54:13] GIL load: 0.80 (0.80, 0.80, 0.80)
[2017-03-15 08:54:19] GIL load: 0.81 (0.81, 0.81, 0.81)
[2017-03-15 08:54:23] GIL load: 0.81 (0.81, 0.81, 0.81)
[2017-03-15 08:54:28] GIL load: 0.81 (0.81, 0.81, 0.81)
[2017-03-15 08:54:33] GIL load: 0.80 (0.80, 0.80, 0.80)
<...>

Answer 2

If you want to know how many times the GIL is taken, you can use gdb breakpoints. For example:

> cat gil_count_example.py
import sys
import threading
from threading import Thread

def worker():
    k=0
    for j in range(10000000):
        k+=j
    return

num_threads = int(sys.argv[1])
threads = []
for i in range(num_threads):
    t = Thread(target = worker)
    t.start()
    threads.append(t)

for t in threads:
    t.join()

For 3.X, break on take_gil:

> cgdb --args python3 gil_count_example.py 8
(gdb) b take_gil
(gdb) ignore 1 100000000
(gdb) r
(gdb) info breakpoints
Num     Type           Disp Enb Address            What
1       breakpoint     keep y   0x00007ffff7c85f10 in take_gil
                                                   at Python-3.4.3/Python/ceval_gil.h:208
        breakpoint already hit 1886 times

For 2.X, break on PyThread_acquire_lock:

> cgdb --args python2 gil_count_example.py 8
(gdb) b PyThread_acquire_lock
(gdb) ignore 1 100000000
(gdb) r
(gdb) info breakpoints
Num     Type           Disp Enb Address            What
  1       breakpoint     keep y   0x00000039bacfd410 
        breakpoint already hit 1584561 times

An efficient poor man's profiler can also be used to profile the time spent in functions; I used https://github.com/knielsen/knielsen-pmp

 > ./get_stacktrace --max=100 --freq=10 `/sbin/pidof python2`
 ...
 292  71.92% sem_wait:PyThread_acquire_lock

> ./get_stacktrace --max=100 --freq=10 `/sbin/pidof python3`
...
557  77.68%  pthread_cond_timedwait:take_gil
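
The sampling idea behind a poor man's profiler can be sketched in pure Python as well. This is only a rough analogue, not what knielsen-pmp does: it relies on the CPython-specific `sys._current_frames()`, it sees Python frames only (so native functions such as take_gil never show up), and the names `busy_worker` and `sample_thread` are made up for this illustration.

```python
import collections
import sys
import threading
import time


def busy_worker(stop_flag):
    # Pure-Python loop with no function calls, so this function stays the
    # innermost Python frame of the thread while it runs.
    k = 0
    while not stop_flag[0]:
        k += 1


def sample_thread(thread, num_samples=50, interval=0.005):
    """Periodically record the innermost Python function of `thread`."""
    counts = collections.Counter()
    for _ in range(num_samples):
        # sys._current_frames() maps thread ids to their current frames
        # (CPython implementation detail).
        frame = sys._current_frames().get(thread.ident)
        name = frame.f_code.co_name if frame is not None else "<no frame>"
        counts[name] += 1
        time.sleep(interval)
    return counts


if __name__ == "__main__":
    stop_flag = [False]
    worker = threading.Thread(target=busy_worker, args=(stop_flag,))
    worker.start()
    try:
        counts = sample_thread(worker)
    finally:
        stop_flag[0] = True
        worker.join()
    total = sum(counts.values())
    for name, hits in counts.most_common():
        print(f"{hits:5d} {100.0 * hits / total:6.2f}% {name}")
```

Because the sampler itself needs the GIL to run, it slightly perturbs what it measures; an external sampler such as the one above avoids that.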

Answer 3

I don't know of such a tool.

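But there are some heuristics that can help you guess whether going multi-threaded will help. As you probably know, the GIL is released during IO operations and during some calls into native code, especially in third-party native modules. If you don't have much code like that, multi-threading probably won't help you.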
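
A minimal sketch of that heuristic, assuming a standard CPython build with the GIL enabled. `time.sleep` stands in for a blocking IO call (it releases the GIL while waiting), and the helper names are invented for this example.

```python
import threading
import time


def blocking_task(seconds):
    # time.sleep releases the GIL while it waits, much like blocking IO does.
    time.sleep(seconds)


def cpu_task(iterations):
    # Pure-Python arithmetic holds the GIL for almost the whole loop.
    total = 0
    for i in range(iterations):
        total += i
    return total


def run_in_threads(target, arg, num_threads):
    """Run target(arg) in num_threads threads; return elapsed wall time."""
    threads = [
        threading.Thread(target=target, args=(arg,)) for _ in range(num_threads)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    # Four blocking tasks overlap, so this takes roughly as long as one.
    print("4 blocking tasks:", run_in_threads(blocking_task, 0.2, 4))
    # Four CPU tasks contend for the GIL, so expect little or no overlap.
    print("1 cpu task:      ", run_in_threads(cpu_task, 2_000_000, 1))
    print("4 cpu tasks:     ", run_in_threads(cpu_task, 2_000_000, 4))
```

If the code you care about behaves like `blocking_task`, threads are likely to help; if it behaves like `cpu_task`, they probably will not.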

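If you do have IO or native code, you will probably just have to try it. Depending on the code base, converting the whole thing to take advantage of multiple threads might be a lot of work, so you may prefer to apply multi-threading only to the parts where you know IO or native code is called, and then measure to see whether you get any improvement.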
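
One way to do that kind of measurement is a small timing harness that runs the same calls sequentially and in a thread pool. This is a sketch under assumptions: `fake_io_call` is a hypothetical stand-in for the real IO or native call, and the helper names are not from any library.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def time_sequential(func, inputs):
    """Wall time for calling func on every input, one after another."""
    start = time.perf_counter()
    for item in inputs:
        func(item)
    return time.perf_counter() - start


def time_threaded(func, inputs, max_workers=4):
    """Wall time for calling func on every input from a thread pool."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # list() forces every result to be collected before the timer stops.
        list(pool.map(func, inputs))
    return time.perf_counter() - start


def threading_speedup(func, inputs, max_workers=4):
    """Ratio of sequential time to threaded time (above 1 means faster)."""
    sequential = time_sequential(func, inputs)
    threaded = time_threaded(func, inputs, max_workers)
    return sequential / threaded


def fake_io_call(seconds):
    # Hypothetical placeholder: replace with the call you want to evaluate.
    time.sleep(seconds)


if __name__ == "__main__":
    speedup = threading_speedup(fake_io_call, [0.1] * 4)
    print(f"speedup from threads: {speedup:.2f}x")
```

A speedup close to 1 suggests the call holds the GIL most of the time; a speedup near the number of workers suggests it releases it.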

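Depending on your use case, multiprocessing can work for workloads that are primarily CPU-bound. Multiprocessing does add overhead, so it is typically a good fit for CPU-bound tasks that run for a relatively long time (several seconds or longer).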
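
As a rough illustration, the standard library's `concurrent.futures.ProcessPoolExecutor` can spread CPU-bound work over several processes, each with its own interpreter and GIL. The task below is a made-up placeholder, and the input sizes are arbitrary.

```python
import time
from concurrent.futures import ProcessPoolExecutor


def cpu_bound_task(n):
    # Placeholder CPU-bound work: sum of squares below n.
    total = 0
    for i in range(n):
        total += i * i
    return total


def run_in_processes(inputs, max_workers=None):
    """Apply cpu_bound_task to each input using a pool of worker processes."""
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(cpu_bound_task, inputs))


if __name__ == "__main__":
    # The __main__ guard matters: worker processes may re-import this module.
    inputs = [5_000_000] * 4

    start = time.perf_counter()
    sequential = [cpu_bound_task(n) for n in inputs]
    print("sequential:", time.perf_counter() - start)

    start = time.perf_counter()
    parallel = run_in_processes(inputs)
    print("processes: ", time.perf_counter() - start)

    assert sequential == parallel
```

Arguments and results are pickled and sent between processes, which is part of the overhead mentioned above; for very short tasks that cost can outweigh the gain.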
