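A few weeks ago I asked a question about speeding up a function written in Python. At that time, TryPyPy brought to my attention the possibility of using Cython to do so. He also gave an example of how I could Cythonize that code snippet. I want to do the same with the code below, to see how much faster I can make it by declaring variable types. I have read the tutorial on cython.org, but I still have some questions. They are closely related:

I can use double to get a Python float. What do I do for lists? In general, where can I find the corresponding C type for a given Python type?

Any example of how I could Cythonize the code below would be really helpful. I have inserted comments in the code that give information about the variable types.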
import math
import random


class Some_class(object):
    # ... other attributes and functions ...

    def update_awareness_status(self, this_var, timePd):
        '''Inputs: this_var (type: float)
                   timePd (type: int)
        Output: None'''
        max_number = len(self.possibilities)
        # self.possibilities is a list of tuples.
        # Each tuple is a pair of person objects.
        k = int(math.ceil(0.3 * max_number))
        actual_number = random.choice(range(k))
        chosen_possibilities = random.sample(self.possibilities,
                                             actual_number)
        if len(chosen_possibilities) > 0:
            # chosen_possibilities is a list of tuples, each tuple is a pair
            # of person objects. I have included the code for the Person class
            # below.
            for p1, p2 in chosen_possibilities:
                # awareness_status is a tuple (float, int)
                if p1.awareness_status[1] < p2.awareness_status[1]:
                    if p1.value > p2.awareness_status[0]:
                        p1.awareness_status = (this_var, timePd)
                    else:
                        p1.awareness_status = p2.awareness_status
                elif p1.awareness_status[1] > p2.awareness_status[1]:
                    if p2.value > p1.awareness_status[0]:
                        # NOTE: `price` is not defined anywhere in this snippet.
                        p2.awareness_status = (price, timePd)
                    else:
                        p2.awareness_status = p1.awareness_status
                else:
                    pass


class Person(object):
    def __init__(self, id, value):
        self.value = value
        self.id = id
        self.max_val = 50000
        ## Initial awareness status.
        # Named awareness_status to match the attribute read in the loop above.
        self.awareness_status = (self.max_val, -1)
Answer 0 (score: 7)
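In general, you can see exactly what C code Cython generates for each source line by running the cython command with the -a ("annotate") option. See the Cython documentation for examples. This is very helpful when trying to find bottlenecks in a function's body.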
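Also, there is the concept of "early binding for speed" when Cythonizing your code. A Python object (such as an instance of your Person class below) uses generic Python code for attribute access, which is slow in an inner loop. I suspect that if you change the Person class to a cdef class, you will see some speedup. You also need to type the p1 and p2 objects in the inner loop.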
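Since your code has a lot of Python calls (random.sample, for example), you likely won't get a huge speedup unless you find a way to move those lines into C, which takes a good deal of effort.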
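One rough way to check how much of the runtime sits in the random.sample call versus the comparison loop is to time the two parts separately in plain Python. The sketch below is only an illustration: the Person stand-in, the helper names (make_pairs, sample_pairs, update_pairs), and the use of this_var in both branches are assumptions, not code from the question.

```python
import math
import random
import timeit


class Person:
    """Plain-Python stand-in for the Person class in the question."""

    def __init__(self, id, value):
        self.id = id
        self.value = value
        self.awareness_status = (50000, -1)


def make_pairs(n):
    # Build n pairs with varied statuses so both branches of the loop run.
    pairs = []
    for i in range(n):
        p1 = Person(2 * i, random.randint(0, 10))
        p2 = Person(2 * i + 1, random.randint(0, 10))
        p1.awareness_status = (random.uniform(0, 10), random.randint(0, 5))
        p2.awareness_status = (random.uniform(0, 10), random.randint(0, 5))
        pairs.append((p1, p2))
    return pairs


def sample_pairs(possibilities):
    # This part remains a Python-level call even inside a .pyx file.
    k = int(math.ceil(0.3 * len(possibilities)))
    return random.sample(possibilities, random.choice(range(k)))


def update_pairs(chosen, this_var, time_pd):
    # Simplified copy of the comparison loop; typing helps only here.
    for p1, p2 in chosen:
        if p1.awareness_status[1] < p2.awareness_status[1]:
            if p1.value > p2.awareness_status[0]:
                p1.awareness_status = (this_var, time_pd)
            else:
                p1.awareness_status = p2.awareness_status
        elif p1.awareness_status[1] > p2.awareness_status[1]:
            if p2.value > p1.awareness_status[0]:
                p2.awareness_status = (this_var, time_pd)
            else:
                p2.awareness_status = p1.awareness_status


pairs = make_pairs(10000)
chosen = sample_pairs(pairs)
t_sample = timeit.timeit(lambda: sample_pairs(pairs), number=20)
t_loop = timeit.timeit(lambda: update_pairs(chosen, 1.0, 0), number=20)
print("sampling: %.4fs   loop: %.4fs" % (t_sample, t_loop))
```

If the sampling time is a large share of the total, typing the loop variables alone cannot shrink that share.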
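You can type things as tuple or list, but that rarely gives much of a speedup. It is better to use C arrays where possible; that is something you will have to look up.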
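As one possible direction, the per-person (float, int) tuples could be replaced by two typed, contiguous buffers indexed by person. The sketch below uses Python's array module for this; the layout and the names status_value, status_time, and update_pair are hypothetical and do not come from the question. Cython can read array.array buffers through typed memoryviews, which is what would make such a layout attractive.

```python
from array import array

# Hypothetical layout: one slot per person instead of one tuple per person.
n_persons = 5
values = array("d", [1.0, 2.0, 3.0, 4.0, 5.0])    # person.value
status_value = array("d", [50000.0] * n_persons)  # awareness_status[0]
status_time = array("l", [-1] * n_persons)        # awareness_status[1]


def update_pair(i, j, this_var, time_pd):
    """Apply the question's update rule to persons i and j, by index."""
    if status_time[i] < status_time[j]:
        if values[i] > status_value[j]:
            status_value[i], status_time[i] = this_var, time_pd
        else:
            status_value[i], status_time[i] = status_value[j], status_time[j]
    elif status_time[i] > status_time[j]:
        if values[j] > status_value[i]:
            status_value[j], status_time[j] = this_var, time_pd
        else:
            status_value[j], status_time[j] = status_value[i], status_time[i]


# Person 1 has a newer status with a low value, so person 0 gets (this_var, time_pd).
status_value[1], status_time[1] = 0.5, 3
update_pair(0, 1, 9.0, 7)
print(status_value[0], status_time[0])
```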
I get a speedup factor of 1.6 with the trivial modifications below. Note that I had to change a few things here and there to get it to compile.
ctypedef int ITYPE_t

cdef class CyPerson:
    # These attributes are placed in the extension type's C-struct, so C-level
    # access is _much_ faster.
    cdef ITYPE_t value, id, max_val
    cdef tuple awareness_status

    def __init__(self, ITYPE_t id, ITYPE_t value):
        # The __init__ function is much the same as before.
        self.value = value
        self.id = id
        self.max_val = 50000
        ## Initial awareness status.
        self.awareness_status = (self.max_val, -1)

NPERSONS = 10000

import math
import random

class Some_class(object):

    def __init__(self):
        ri = lambda: random.randint(0, 10)
        self.possibilities = [(CyPerson(ri(), ri()), CyPerson(ri(), ri())) for i in range(NPERSONS)]

    def update_awareness_status(self, this_var, timePd):
        '''Inputs: this_var (type: float)
                   timePd (type: int)
        Output: None'''
        cdef CyPerson p1, p2
        price = 10

        max_number = len(self.possibilities)
        # self.possibilities is a list of tuples.
        # Each tuple is a pair of person objects.
        k = int(math.ceil(0.3 * max_number))
        actual_number = random.choice(range(k))
        chosen_possibilities = random.sample(self.possibilities,
                                             actual_number)
        if len(chosen_possibilities) > 0:
            # chosen_possibilities is a list of tuples, each tuple is a pair
            # of person objects. I have included the code for the Person class
            # below.
            for persons in chosen_possibilities:
                p1, p2 = persons
                # awareness_status is a tuple (float, int)
                if p1.awareness_status[1] < p2.awareness_status[1]:
                    if p1.value > p2.awareness_status[0]:
                        p1.awareness_status = (this_var, timePd)
                    else:
                        p1.awareness_status = p2.awareness_status
                elif p1.awareness_status[1] > p2.awareness_status[1]:
                    if p2.value > p1.awareness_status[0]:
                        p2.awareness_status = (price, timePd)
                    else:
                        p2.awareness_status = p1.awareness_status
Answer 1 (score: 1)
C has no direct notion of a list.
The basic data types are int (char, short, long), float/double (all of which map fairly directly to Python types) and pointers.
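If the concept of pointers is new to you, have a look at: Wikipedia:Pointers
In some cases, pointers can be used as a replacement for tuples/arrays. Pointers to chars are the basis of all strings.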
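To get a feel for these basic types without writing any C, Python's standard ctypes module exposes them directly. The sketch below is only an illustration using ctypes as a stand-in; it is not Cython syntax, and the variable names are made up for the example.

```python
import ctypes

# Each ctypes object wraps one C value of a fixed type and size.
c_count = ctypes.c_int(42)          # Python int   -> C int
c_total = ctypes.c_long(1000000)    # Python int   -> C long
c_price = ctypes.c_double(0.25)     # Python float -> C double
c_name = ctypes.c_char_p(b"abc")    # Python bytes -> char * (NUL-terminated)

for label, obj in [("int", c_count), ("long", c_total),
                   ("double", c_price), ("char *", c_name)]:
    # sizeof reports how many bytes the C value occupies on this platform.
    print(label, ctypes.sizeof(obj), obj.value)
```

The sizes printed for long and char * depend on the platform, which is one reason C code has to be explicit about types.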
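Say you have an array of integers. It is stored as a contiguous block of memory with a start address, so you declare the element type (int) and that the variable is a pointer (*):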
cdef int * array;
Now you can access each element of the array like this:
array[0] = 1
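However, the memory has to be allocated first (e.g. using malloc), and advanced indexing will not work (for instance, array[-1] will be arbitrary data in memory; the same holds for indexes beyond the end of the reserved block).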
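The same ideas can be tried out from plain Python with the standard ctypes module. This is a sketch for illustration only: ctypes allocates the block here instead of malloc, and checked_get is a hypothetical helper showing the bounds check that a raw pointer does not perform for you.

```python
import ctypes

n = 4
IntArray = ctypes.c_int * n           # the type "array of 4 C ints"
block = IntArray(10, 20, 30, 40)      # ctypes allocates and owns this memory

# A pointer to the first element, comparable to `cdef int *array`.
ptr = ctypes.cast(block, ctypes.POINTER(ctypes.c_int))
ptr[0] = 1                            # writes through the pointer into the block

# Elements sit side by side: element i lives at start + i * sizeof(int).
start = ctypes.addressof(block)
stride = ctypes.sizeof(ctypes.c_int)
third = ctypes.c_int.from_address(start + 2 * stride).value


def checked_get(p, index, length):
    """Read p[index], refusing indexes outside the allocated block."""
    if not 0 <= index < length:
        raise IndexError("index %d outside block of length %d" % (index, length))
    return p[index]


print(block[0], third, checked_get(ptr, 3, n))
```

The pointer itself carries no length, so the caller has to pass it along, just as in C.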
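More complex types do not map directly to C, but there is often a C way of doing the same thing that does not need the Python type at all (e.g. a for loop does not need a range array/iterator).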
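As you noticed yourself, writing good Cython code requires more detailed knowledge of C, so working through a C tutorial is probably the best next step.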