Question

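I have a Python function that iterates over a NumPy array X_disc and a list of Python namedtuples, rule_list.

The annotated file shows a performance hit in the second for loop, which I attribute to the use of namedtuples.

What is the lowest-hanging fruit for improvement here?
I also don't see why the first for loop is yellow. In theory, x is already declared as cdef long[:] x, but maybe the declaration is not doing what I expect. What should I do to avoid this?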


In Cython's annotated output, yellow lines indicate Python interaction. The function is:

    import numpy as np
    cimport numpy as cnp

    cimport cython

    @cython.boundscheck(False)
    cpdef long[:,:] cy_count_lhs_verified_12(cnp.ndarray[long, ndim=2] X_disc,
                                             list rule_list,
                                             int n_rhs,
                                             dict class_to_pos):

        cdef long n_samples = X_disc.shape[0]
        cdef long n_features = X_disc.shape[1]
        cdef int m = 0
        cdef int n_verified = 0
        cdef long[:] x
        cdef cnp.ndarray R = np.zeros([n_samples, n_rhs], dtype=np.int64)
        cdef int p
        cdef size_t l

        for x in X_disc:
            for rule in rule_list:
                # X_disc is an array
                # rule_list is a list of python named tuples
                l = len(rule.LHS)

                n_verified = 0
                for (f, v) in rule.LHS:
                    if x[f] == v:
                        n_verified += 1
                    else:
                        break

                if n_verified == l:
                    # Recall rule[0] is the RHS
                    p = class_to_pos[rule.RHS[1]]
                    R[m, 0] += 1

            m += 1

        return R
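
One direction that might be worth trying, sketched here in plain Python: the inner loops touch Python objects (the namedtuple attributes, the (f, v) tuples, the dict lookup) on every iteration, so flattening rule_list into integer arrays once, before the loops, would leave only integer indexing inside them. This is a minimal sketch under assumptions, not a tested Cython solution. The Rule type, flatten_rules and count_lhs_verified are hypothetical names; the sketch assumes RHS is a pair whose second element is the class, and that the count is meant to land in column p (the code above computes p but writes to column 0). array("l") holds C longs and exposes the buffer protocol, so the flat arrays could presumably be received as long[:] memoryviews on the Cython side. Iterating with for x in X_disc most likely goes through Python's iterator protocol as well, so an integer loop over row indices may help with the first loop.

```python
from array import array
from collections import namedtuple

# Hypothetical rule type: the question only shows that rules expose .RHS and .LHS,
# with rule[0] being the RHS.
Rule = namedtuple("Rule", ["RHS", "LHS"])


def flatten_rules(rule_list, class_to_pos):
    """Convert the rules into flat arrays of C longs, once, outside the hot loop.

    The LHS conditions of rule i are the pairs
    (feats[k], vals[k]) for k in range(offsets[i], offsets[i + 1]).
    """
    feats = array("l")
    vals = array("l")
    offsets = array("l", [0])
    rhs_pos = array("l")
    for rule in rule_list:
        for f, v in rule.LHS:
            feats.append(f)
            vals.append(v)
        offsets.append(len(feats))
        # Assumption: RHS[1] is the class label, as in the question's code.
        rhs_pos.append(class_to_pos[rule.RHS[1]])
    return feats, vals, offsets, rhs_pos


def count_lhs_verified(X_rows, flat_rules, n_rhs):
    """Pure-Python reference for the counting loop, using only integer indexing."""
    feats, vals, offsets, rhs_pos = flat_rules
    R = [[0] * n_rhs for _ in X_rows]
    for m, x in enumerate(X_rows):
        for i in range(len(rhs_pos)):
            matched = True
            for k in range(offsets[i], offsets[i + 1]):
                if x[feats[k]] != vals[k]:
                    matched = False
                    break
            if matched:
                # Assumption: the count belongs in the column of the rule's class.
                R[m][rhs_pos[i]] += 1
    return R
```

The reference loop exists only to check the flattened layout against the original logic; the speed-up, if any, would come from porting that loop to Cython with typed indices and memoryviews.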

How to improve a Cython function that iterates over a list of namedtuples

0 answers: