Question

I have the following Cython code:

# cython: profile=True
import cProfile
from cython import parallel
from libc.stdio cimport FILE, fopen, fclose, fwrite, getline, printf
from libc.string cimport strlen
from libcpp.string cimport string
cdef extern from "stdlib.h" nogil:
    int mkstemp(char*)

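cdef run_io(string obj):
    cdef int i, dump
    cdef size_t len_ = 0
    # mkstemp() overwrites the trailing XXXXXX of this template in place.
    cdef char* fname = "/tmp/tmpc_XXXXXX"
    cdef char* nullchar = NULL
    cdef char* line = NULL
    cdef string content = b""
    cdef FILE* cfile
    for i in range(10000):
        # Returns an open file descriptor (unused here), or -1 on failure.
        dump = mkstemp(fname)
        # Write the payload to the temporary file.
        cfile = fopen(fname, "wb")
        fwrite(obj.data(), 1, obj.size(), cfile)
        fclose(cfile)
        # Read the file back line by line into content.
        cfile = fopen(fname, "rb")
        while True:
            if getline(&line, &len_, cfile) == -1:
                break
            else:
                content.append(line)
        fclose(cfile)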

def run_test():
    cdef string obj = b"abc\ndef"
    cProfile.runctx("run_io(obj)", globals(), locals())

When I try to run it from the python3 console, I get this error:

NameError: name 'run_io' is not defined

If I change the definition of run_io from cdef to def, it works:

         7 function calls in 2.400 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    2.400    2.400 <string>:1(<module>)
        2    0.000    0.000    0.000    0.000 stringsource:13(__pyx_convert_string_from_py_std__in_string)
        1    2.400    2.400    2.400    2.400 testc2.pyx:10(run_io)
        1    0.000    0.000    2.400    2.400 {built-in method exec}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        1    0.000    0.000    2.400    2.400 {test.run_io}

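However, this is not very useful, because I only see the total run time of the whole function. I would like to see the time spent in each part: generating the file name, writing, reading, and so on.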

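So I have two questions:

Is it possible to profile Cython functions defined with cdef? If so, how?
How can I make the profile more informative, i.e. measure the time of each called function?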

Answer 1

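Python's profile and cProfile profilers are deterministic, which means they work by capturing the wall-clock time on entry to and exit from functions, but only for the functions the profiler has been told to profile. They cannot give you line-by-line timing information unless they also capture the time before and/or after every line of code.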
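To make that concrete, here is a minimal sketch in plain Python, not Cython. It assumes the work from the question is split into small helper functions, so that the profiler has an entry and an exit to time for each step. The helper names write_file and read_file and the rounds parameter are hypothetical and do not come from the question. Whether the same split pays off in the Cython version depends on the helpers being compiled with profiling enabled, which is an assumption here.

```python
import cProfile
import io
import os
import pstats
import tempfile


def write_file(path, payload):
    # Hypothetical helper: the "write" step of the loop.
    with open(path, "wb") as fh:
        fh.write(payload)


def read_file(path):
    # Hypothetical helper: the "read" step of the loop.
    with open(path, "rb") as fh:
        return fh.readlines()


def run_io(payload, rounds=200):
    # Create one temporary file and reuse it for every round.
    fd, path = tempfile.mkstemp(prefix="tmpc_")
    os.close(fd)
    try:
        content = []
        for _ in range(rounds):
            write_file(path, payload)
            content.extend(read_file(path))
        return content
    finally:
        os.remove(path)


def profile_run_io():
    # Profile run_io and return the report as text.
    profiler = cProfile.Profile()
    profiler.enable()
    run_io(b"abc\ndef")
    profiler.disable()
    out = io.StringIO()
    pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats()
    return out.getvalue()


if __name__ == "__main__":
    print(profile_run_io())
```

The report now has separate rows for write_file and read_file, because each one is a function call the profiler can time. Anything that happens inside a single function body is still lumped into that function's row.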

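One way to get line-by-line information, if you don't mind giving up some measurement precision (which is not all it's cracked up to be), is to take a small number of stack samples. Every line of code has an inclusive time (cumtime), and as a fraction of the total time this is exactly the fraction of time the line is on the stack. So when you take a stack sample, that fraction is the probability that you will see the line in it. If you look at 10 or 20 samples, you get a good idea of which lines of code take a lot of time.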
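A rough sketch of the sampling idea in plain Python follows. The answer describes interrupting the program by hand; this version instead automates the interruptions with a helper thread and sys._current_frames(), which is a choice made for this example. The functions busy, workload, and sample_stacks are hypothetical. The sketch only sees Python-level frames, so compiled Cython code would need a native debugger or sampler to apply the same idea.

```python
import collections
import sys
import threading
import time
import traceback


def busy(seconds):
    # Hypothetical hot spot: spin for roughly `seconds` of wall time.
    end = time.perf_counter() + seconds
    while time.perf_counter() < end:
        pass


def workload():
    busy(0.05)  # cheap step
    busy(0.45)  # expensive step


def sample_stacks(target, n_samples=20, interval=0.02):
    # Run target() in the calling thread while a helper thread samples its stack.
    caller_id = threading.get_ident()
    counts = collections.Counter()
    taken = 0
    done = threading.Event()

    def sampler():
        nonlocal taken
        while taken < n_samples and not done.is_set():
            frame = sys._current_frames().get(caller_id)
            if frame is not None:
                stack = traceback.extract_stack(frame)
                # Count each (function, line) at most once per sample.
                counts.update({(f.name, f.lineno) for f in stack})
                taken += 1
            time.sleep(interval)

    thread = threading.Thread(target=sampler, daemon=True)
    thread.start()
    try:
        target()
    finally:
        done.set()
        thread.join()
    return counts, taken


if __name__ == "__main__":
    counts, taken = sample_stacks(workload)
    for (name, lineno), hits in counts.most_common(5):
        print(f"{name}:{lineno} on {hits}/{taken} samples")
```

The line in workload that calls busy(0.45) should appear in far more samples than the line that calls busy(0.05), which is the kind of per-line signal a function-level profile does not give.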

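An objection I sometimes hear is "Won't that slow the program down completely and invalidate the results?" Well, think about it. There is a line of code that takes some fraction of the time, say 36.5%, so it is on the stack for that fraction of the time. Now you start the program, and 9 seconds later you interrupt it to look at the stack. There is a 36.5% chance that the line is on the stack. Now the doorbell rings, and you don't get back to look at it until a week later. Does that week-long delay change the probability that the line is in the stack sample? Of course not. The computer is patient. The stack sample does not change, no matter how long you take to look at it.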
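The arithmetic behind the 36.5% example can be sketched as follows. The sketch assumes the samples are independent and taken at random times, which is an assumption of this example and is not stated in the answer. The function names prob_at_least and expected_hits are hypothetical.

```python
from math import comb


def prob_at_least(k, n, p):
    # Probability that a line on the stack a fraction p of the time
    # shows up in at least k of n independent samples (binomial tail).
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))


def expected_hits(n, p):
    # Expected number of samples, out of n, that contain the line.
    return n * p


if __name__ == "__main__":
    p = 0.365
    for n in (10, 20):
        print(n, expected_hits(n, p), prob_at_least(1, n, p), prob_at_least(3, n, p))
```

With 20 samples, a line that is on the stack 36.5% of the time is expected in 20 × 0.365 = 7.3 of them, so a handful of samples is enough to make an expensive line stand out.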
