Efficient Python array to numpy array conversion

Time: 2011-04-15 09:41:52

Tags: python numpy

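I get a large array (a 12 Mpix image) in the array format of the Python standard library. Since I want to perform operations on it, I would like to convert it to a numpy array. I tried the following: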

import array
from datetime import datetime

import numpy

# 12 million doubles, standing in for a 12 Mpix image
test = array.array('d', [0]*12000000)
t = datetime.now()
numpy.array(test)
print(datetime.now() - t)  # elapsed time for the conversion

The conversion takes between one and two seconds, which is about as slow as an explicit loop in Python.

Is there a more efficient way to do this conversion?

2 answers:

Answer 0: (score: 49)

np.array(test)                                       # 1.19s

np.fromiter(test, dtype=int)                         # 1.08s

np.frombuffer(test)                                  # 459ns !!!
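
The gap between these timings is consistent with np.frombuffer not copying anything: it wraps the buffer that array.array already exposes. The following is a minimal sketch of that behaviour on a small array (the 5-element size and the variable names are chosen only for illustration). It relies on the default dtype of np.frombuffer being float, which matches the 'd' typecode.

```python
import array
import numpy as np

# Small stand-in for the 12-million-element array from the question.
test = array.array('d', [0.0] * 5)

# Wrap the existing buffer; no element data is copied.
view = np.frombuffer(test)  # default dtype is float64, matching typecode 'd'

# A write through the NumPy view shows up in the original array.array.
view[0] = 3.5
shares_memory = (test[0] == 3.5)

# np.array, by contrast, produces an independent copy.
copy = np.array(test)
copy[1] = 7.0
is_independent = (test[1] == 0.0)

print(shares_memory, is_independent)
```

Because the result shares memory with the source, changes on either side are visible on the other; call .copy() on the result if an independent array is needed.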

Answer 1: (score: 1)

asarray(x) is almost always the best choice for any array-like object.

array and fromiter are slow because they perform a copy. Using asarray allows this copy to be elided:

>>> import array
>>> import numpy as np
>>> test = array.array('d', [0]*12000000)
# very slow - this makes multiple copies that grow each time
>>> %timeit np.fromiter(test, dtype=test.typecode)
626 ms ± 3.97 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

# fast memory copy
>>> %timeit np.array(test)
63.5 ms ± 639 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

# which is equivalent to doing the fast construction followed by a copy
>>> %timeit np.asarray(test).copy()
63.4 ms ± 371 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

# so doing just the construction is way faster
>>> %timeit np.asarray(test)
1.73 µs ± 70.2 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

# marginally faster, but at the expense of verbosity and type safety if you
# get the wrong type
>>> %timeit np.frombuffer(test, dtype=test.typecode)
1.07 µs ± 27.3 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
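
The type-safety remark above can be illustrated with a short sketch. It assumes a two-element array of doubles and deliberately passes a mismatched dtype (np.int64) to np.frombuffer; the variable names are illustrative only. np.asarray picks the element type up from the buffer itself, while np.frombuffer reinterprets the raw bytes as whatever dtype it is told.

```python
import array
import numpy as np

test = array.array('d', [1.0, 2.0])

# asarray infers the dtype from the buffer, so the values come through intact.
safe = np.asarray(test)

# frombuffer trusts the dtype argument: the same 16 bytes are reinterpreted
# as two 64-bit integers, which are not the values 1 and 2.
wrong = np.frombuffer(test, dtype=np.int64)

# Passing the array's own typecode keeps frombuffer consistent with the data.
right = np.frombuffer(test, dtype=test.typecode)

print(safe.dtype, safe.tolist())
print(wrong.dtype, wrong.tolist() == [1, 2])
print(right.dtype, right.tolist())
```

No error is raised in the mismatched case, which is why asarray is the safer default when the extra fraction of a microsecond does not matter.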