The problem statement is simple: given an arbitrary number of one-dimensional NumPy vectors of floats, such as:
v1 = numpy.array([0, 0, 0.5, 0.5, 1, 1, 1, 1, 0, 0])
v2 = numpy.array([4, 4, 4, 5, 5, 0, 0])
v3 = numpy.array([1.1, 1.1, 1.2])
v4 = numpy.array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 10])
what is the fastest way to sum them element-wise?
many_vectors = [v1, v2, v3, v4]
Using the built-in sum function directly will not work, because the vectors can have arbitrary, unequal lengths:
>>> result = sum(many_vectors)
ValueError: operands could not be broadcast together with shapes (10,) (7,)
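As a minimal sketch of why this fails (not part of the original question; `try_add` is a hypothetical helper used only for illustration): NumPy adds arrays element-wise, so the shapes must be equal or broadcastable, and (10,) and (7,) are neither.

```python
import numpy

v1 = numpy.array([0, 0, 0.5, 0.5, 1, 1, 1, 1, 0, 0])
v2 = numpy.array([4, 4, 4, 5, 5, 0, 0])

def try_add(a, b):
    """Return a + b, or None if the shapes cannot be broadcast together."""
    try:
        return a + b
    except ValueError:
        return None

failed = try_add(v1, v2)      # shapes (10,) and (7,): addition raises ValueError
worked = try_add(v1[:7], v2)  # equal lengths: plain element-wise addition
```

Any solution therefore has to pad the shorter vectors, explicitly or implicitly, up to the length of the longest one.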
Instead, one can use the pandas library, whose fillna method offers a simple way around this issue.
>>> pandas.DataFrame(v for v in many_vectors).fillna(0.0).sum().values
array([ 5.1, 5.1, 5.7, 5.5, 6. , 1. , 1. , 1. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 10. ])
But this is probably not the most efficient approach, since the production use case will involve large amounts of data.
In [9]: %timeit pandas.DataFrame(v for v in many_vectors).fillna(0.0).sum().values
1.16 ms ± 97.7 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
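Answer 0 (score: 2)
Approach 1
With input arrays that large, and that many of them, we need to be memory-efficient, so I would suggest a loop that accumulates one array at a time: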
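import numpy as np

many_vectors = [v1, v2, v3, v4]  # list of all vectors
lens = [len(i) for i in many_vectors]
L = max(lens)
out = np.zeros(L)  # zero-filled accumulator as long as the longest vector
for l, v in zip(lens, many_vectors):
    out[:l] += v  # add each vector into the leading slice it covers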
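Approach 2
Another, vectorized approach uses masking to build a regular 2D array from the list of irregularly shaped vectors, and then sums down each column (axis 0) to get the final output: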
# Inspired by https://stackoverflow.com/a/38619350/ @Divakar
def stack1Darrs(v):
    lens = np.array([len(item) for item in v])
    mask = lens[:,None] > np.arange(lens.max())
    out_dtype = np.result_type(*[i.dtype for i in v])
    out = np.zeros(mask.shape,dtype=out_dtype)
    out[mask] = np.concatenate(v)
    return out

out = stack1Darrs(many_vectors).sum(0)
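To make the masking step more concrete, here is a small walk-through on two made-up vectors (the input values are chosen only for illustration and do not come from the question). It follows the same steps as the function above, with the intermediate arrays kept under separate names.

```python
import numpy as np

# Two small ragged vectors, chosen only for illustration.
vectors = [np.array([1.0, 2.0, 3.0]), np.array([10.0])]

lens = np.array([len(v) for v in vectors])  # lengths 3 and 1

# Row i of the mask is True in its first lens[i] columns and False in the
# padding columns.
mask = lens[:, None] > np.arange(lens.max())

padded = np.zeros(mask.shape)
# Boolean assignment fills the True positions in row-major order, which is
# the same order in which np.concatenate lays out the vectors.
padded[mask] = np.concatenate(vectors)

total = padded.sum(axis=0)  # sum down each column
```

Here `padded` comes out as a 2x3 array whose second row is 10 followed by two zeros, so the column sums are 11, 2 and 3. Note that the padded array needs memory proportional to the number of vectors times the longest length, which is the cost the loop-based approach avoids.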
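Answer 1 (score: 0)
Credit goes to @Divakar. This answer only expands on and polishes his answer. In particular, I rewrote the functions to match our style guide and timed them.
There are two possible approaches:
Approach 1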
###############################################################################
def sum_vectors_with_padding_1(vectors):
    """Given an arbitrary amount of NumPy one-dimensional vectors of floats,
    do an element-wise sum, padding with 0 any that are shorter than the
    longest array (see https://stackoverflow.com/questions/56166217).
    """
    import numpy
    all_lengths = [len(i) for i in vectors]
    max_length = max(all_lengths)
    out = numpy.zeros(max_length)
    for l, v in zip(all_lengths, vectors):
        out[:l] += v
    return out
Approach 2
###############################################################################
def sum_vectors_with_padding_2(vectors):
    """Given an arbitrary amount of NumPy one-dimensional vectors of floats,
    do an element-wise sum, padding with 0 any that are shorter than the
    longest array (see https://stackoverflow.com/questions/56166217).
    """
    import numpy
    all_lengths = numpy.array([len(item) for item in vectors])
    mask = all_lengths[:,None] > numpy.arange(all_lengths.max())
    out_dtype = numpy.result_type(*[i.dtype for i in vectors])
    out = numpy.zeros(mask.shape, dtype=out_dtype)
    out[mask] = numpy.concatenate(vectors)
    return out.sum(axis=0)
Timing
>>> v1 = numpy.array([0, 0, 0.5, 0.5, 1, 1, 1, 1, 0, 0])
>>> v2 = numpy.array([4, 4, 4, 5, 5, 0, 0])
>>> v3 = numpy.array([1.1, 1.1, 1.2])
>>> v4 = numpy.array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 10])
>>> many_vectors = [v1, v2, v3, v4]
>>> %timeit sum_vectors_with_padding_1(many_vectors)
12 µs ± 645 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
>>> %timeit sum_vectors_with_padding_2(many_vectors)
22.6 µs ± 669 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
So approach 1 seems faster, at least on these four short vectors!
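The timing above uses only four short vectors, so the ranking may not carry over to production-sized data. Below is a rough sketch for re-measuring on larger inputs outside IPython, using the standard timeit module. The names `sum_loop`, `sum_masked` and `make_vectors` are hypothetical: the first two are compact restatements of the two approaches (always accumulating in float64), and the vector count and lengths are arbitrary assumptions to adjust to the real workload. No results are claimed here.

```python
import timeit

import numpy

def sum_loop(vectors):
    """Approach 1: accumulate each vector into a zero-filled output."""
    out = numpy.zeros(max(len(v) for v in vectors))
    for v in vectors:
        out[:len(v)] += v
    return out

def sum_masked(vectors):
    """Approach 2: pad into a 2D array via a boolean mask, then sum rows."""
    lengths = numpy.array([len(v) for v in vectors])
    mask = lengths[:, None] > numpy.arange(lengths.max())
    out = numpy.zeros(mask.shape)
    out[mask] = numpy.concatenate(vectors)
    return out.sum(axis=0)

def make_vectors(count, max_length, seed=0):
    """Hypothetical helper: random float vectors of random lengths."""
    rng = numpy.random.default_rng(seed)
    lengths = rng.integers(1, max_length + 1, size=count)
    return [rng.random(n) for n in lengths]

# Assumed workload: 200 vectors of up to 500 elements each.
vectors = make_vectors(count=200, max_length=500)

# Both approaches should agree before their timings are compared.
results_match = numpy.allclose(sum_loop(vectors), sum_masked(vectors))

loop_time = timeit.timeit(lambda: sum_loop(vectors), number=20)
mask_time = timeit.timeit(lambda: sum_masked(vectors), number=20)
print(f"loop: {loop_time:.4f} s, masked: {mask_time:.4f} s (20 runs each)")
```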