Timing anomaly when allocating and copying NumPy arrays

Time: 2018-12-15 21:12:14

Tags: python arrays numpy time

I recently ran into an issue with allocating and copying NumPy arrays:

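Allocating an array takes constant time (with respect to array size); copying the contents of another array into an already allocated array also takes some time, which grows with array size. The problem is that the time taken to do both operations together, allocation and copy, is not simply the sum of the two individual times (see the figure below):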

t(allocation + copy) > t(allocation) + t(copy)

I cannot see any reason for the extra elapsed time, which seems to grow rapidly with array size.

[Figure: NumPy allocation + copy timings]

Here is the code I used for timing. The timings were taken on an Intel Core i3 CPU (2.13 GHz) under Debian Stretch.

import numpy as np
import gc
from timeit import default_timer as timer
import matplotlib.pyplot as plt

def time_all(dim1):
   N_TIMES = 10
   shape = (dim1, dim1)
   data_1 = np.empty(shape, np.int16)
   data_2 = np.random.randint(0, 2**14, shape, np.int16)

   # allocate array
   t1 = timer()
   for _ in range(N_TIMES):
      data_1 = np.empty(shape, np.int16)
   alloc_time = (timer() - t1) / N_TIMES

   # copy array
   t1 = timer()
   for _ in range(N_TIMES):
      data_1[:] = data_2
   copy_time = (timer() - t1) / N_TIMES

   # allocate & copy array 
   t1 = timer()
   for _ in range(N_TIMES):
      data_3 = np.empty(shape, np.int16)
      np.copyto(data_3, data_2)
   alloc_copy_time = (timer() - t1) / N_TIMES

   return alloc_time, copy_time, alloc_copy_time
#END def

# measure elapsed times
gc.disable() # disable automatic garbage collection
times_elapsed = np.array([(size, ) + time_all(size)
                for size in np.logspace(2, 14, 1<<8,
                endpoint=True, base=2, dtype=int)])
gc.enable()

# plot results
plt.plot(times_elapsed[:,0], times_elapsed[:,1], marker='+', lw=0.5, label="alloc")
plt.plot(times_elapsed[:,0], times_elapsed[:,2], marker='+', lw=0.5, label="copy")
plt.plot(times_elapsed[:,0], times_elapsed[:,3], marker='+', lw=0.5, label="alloc&copy")
plt.xlabel("array dim.")
plt.legend()
plt.savefig("alloc_copy_time.svg")
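
One way to narrow the gap down is to time only the copy, once into a freshly allocated array and once into a buffer that has already been written to. The sketch below is not part of the original measurements. It assumes, as an untested hypothesis, that the extra time comes from the first write to newly allocated memory rather than from `np.empty` or `np.copyto` themselves. The function name `time_first_touch` and its parameters are made up for illustration.

```python
import numpy as np
from timeit import default_timer as timer

def time_first_touch(dim1, n_times=5):
    """Return (fresh_copy_time, reused_copy_time) in seconds per copy.

    Hypothetical helper: only the copy is timed in both cases, so the
    cost of the np.empty call itself is excluded from both numbers.
    """
    shape = (dim1, dim1)
    src = np.random.randint(0, 2**14, shape, np.int16)

    # Case 1: copy into a brand-new array on every iteration.
    fresh = 0.0
    for _ in range(n_times):
        dst = np.empty(shape, np.int16)   # allocation is NOT timed
        t0 = timer()
        np.copyto(dst, src)               # first write to this memory
        fresh += timer() - t0
        del dst                           # release before the next round

    # Case 2: copy into one buffer that has already been written once.
    dst = np.empty(shape, np.int16)
    dst[:] = 0                            # write to every element up front
    reused = 0.0
    for _ in range(n_times):
        t0 = timer()
        np.copyto(dst, src)
        reused += timer() - t0

    return fresh / n_times, reused / n_times

if __name__ == "__main__":
    for dim in (256, 1024, 2048):
        fresh, reused = time_first_touch(dim)
        print(f"dim={dim}: fresh={fresh:.6f}s reused={reused:.6f}s")
```

If the fresh-array copy is consistently slower than the reused-buffer copy for large sizes, that would point at the cost of writing to new memory for the first time. If the two are about the same, the hypothesis does not hold and the cause lies elsewhere.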

0 answers:

No answers