Is numba's implementation of str.find() slower than pure Python?

Time: 2020-05-07 20:09:35

Tags: python numba

Is the pure Python code for str.find() faster than its numba implementation?
numba == 0.48.0 (I could not load 0.49.0; it appears to have a bug)

from timeit import default_timer as timer
from numba import jit,njit

def search_match(a,search,n):
   for z in range(n):
      i = a.find(search)
   return i

@njit
def search_match_jit(a,search,n):
   for z in range(n):
      i = a.find(search)
   return i

n = 10000000
a  = '.56485.36853.32153.65646.34763.23152.11321.65886.54975.12781.'
search = '2315'

print('Str.find:')
start = timer()
i = search_match(a,search,n)
print(timer() - start)

i = search_match_jit(a,search,1) # precompile
print('Jit:')
start = timer()
i = search_match_jit(a,search,n)
print(timer() - start)

1 Answer:

Answer 0 (score: 2)

The built-in CPython implementation of str.find is not "pure Python": it is already written in C. See https://github.com/python/cpython/blob/master/Objects/stringlib/find.h
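As a rough way to see this from the interpreter, the sketch below checks that str.find is a C-level method descriptor with no Python bytecode behind it, and contrasts it with a naive search written in actual Python. The helper name my_find is hypothetical and is not part of the question's code; the check assumes CPython 3.7 or later, where types.MethodDescriptorType exists.

```python
import inspect
import types


def my_find(haystack, needle):
    """Naive substring search written in real Python (hypothetical helper)."""
    n, m = len(haystack), len(needle)
    for start in range(n - m + 1):
        # Compare a slice of the haystack against the needle at each offset.
        if haystack[start:start + m] == needle:
            return start
    return -1


# str.find is implemented in C, so it is a method descriptor, not a function.
is_c_level = isinstance(str.find, types.MethodDescriptorType)
# It carries no Python code object, unlike a function defined with `def`.
has_bytecode = hasattr(str.find, "__code__")
is_python_function = inspect.isfunction(my_find)

a = '.56485.36853.32153.65646.34763.23152.11321.65886.54975.12781.'
search = '2315'
same_result = my_find(a, search) == a.find(search)

print(is_c_level, has_bytecode, is_python_function, same_result)
```

Timing my_find against a.find with the same harness as in the question would be the comparison against genuinely pure Python; the benchmark in the question instead compares C code with Numba-compiled code.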

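This is not the kind of thing we would expect Numba to speed up. In fact, given the additional complexity Numba has to deal with, it is not surprising that it is slower. See the following "caveat" from the Numba documentation; I want to stress its last sentence: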

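The performance of some operations is known to be slower than the CPython implementation. These include substring search (in, .contains() and find()) and string creation (like .split()). Improving the string performance is an ongoing task, but the speed of CPython is unlikely to be surpassed for basic string operations in isolation. Numba is most successfully used for larger algorithms that happen to involve strings, where basic string operations are not the bottleneck.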

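Basically, the Numba developers added string methods to nopython mode so that users who have a few lines of string handling mixed in with heavy numerical code can compile that code more easily, without redesigning it. But Numba is not meant to speed up string code: its target is heavy numerical work, and string support is there only for convenience.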