How to apply linspace between each element of a numpy vector?

Asked: 2018-12-28 14:57:14

Tags: python numpy

I have the following numpy array:

a = np.array([1,4,2])

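I want to create a new array by placing evenly spaced values between each pair of consecutive elements of a, with 5 points per pair counting both endpoints (that is, 4 equal steps), to get: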

b = [1., 1.75, 2.5, 3.25, 4., 3.5, 3., 2.5, 2.]

How can I do this efficiently in python?

4 answers:

Answer 0: (score: 2)

You are looking for linear interpolation of a 1-D array, which can be done with np.interp.

s = 4       # number of intervals between two numbers
l = (a.size - 1) * s + 1          # total length after interpolation
np.interp(np.arange(l), np.arange(l, step=s), a)        # interpolate

# array([1.  , 1.75, 2.5 , 3.25, 4.  , 3.5 , 3.  , 2.5 , 2.  ])
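To see why this works, it may help to name the two grids passed to np.interp. Below is a minimal sketch that wraps the snippet above in a helper; the name interp_between and the variables x_new and x_known are my own labels, not part of the answer.

```python
import numpy as np

def interp_between(a, s=4):
    """Linearly interpolate with s intervals between consecutive elements of a."""
    a = np.asarray(a, dtype=float)
    l = (a.size - 1) * s + 1          # total length after interpolation
    x_new = np.arange(l)              # positions to evaluate: 0, 1, ..., l-1
    x_known = np.arange(l, step=s)    # positions of the original samples: 0, s, 2s, ...
    return np.interp(x_new, x_known, a)

result = interp_between([1, 4, 2])
print(result)
```

The original values sit at every s-th position of the output grid, and np.interp fills in the positions between them on straight lines.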

Answer 1: (score: 1)

Another option, using arange:

import numpy as np
a = np.array([1,4,2])

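res = np.array([float(a[-1])])   # start with the final endpoint
for x, y in zip(a, a[1:]):
    # insert the 4 points of each segment (right end excluded) before the last element
    res = np.insert(res, -1, np.arange(x, y, (y - x) / 4))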

print(res)
#=> [1.   1.75 2.5  3.25 4.   3.5  3.   2.5  2.  ]
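A possible variant of the same loop, sketched here as my own suggestion rather than part of the answer: np.arange with a non-integer step can be sensitive to rounding, and it cannot produce four points when two neighbours are equal because the step is then zero. Using np.linspace with endpoint=False for each pair avoids both issues, and collecting the pieces in a list avoids calling np.insert repeatedly. The name n_steps is hypothetical.

```python
import numpy as np

a = np.array([1, 4, 2])
n_steps = 4  # intervals per pair of neighbours

# For each pair take n_steps points without the right endpoint,
# then append the very last element of a once at the end.
pieces = [np.linspace(x, y, n_steps, endpoint=False) for x, y in zip(a, a[1:])]
res = np.concatenate(pieces + [[float(a[-1])]])
print(res)
```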

Answer 2: (score: 1)

We can use a vectorized linspace, create_ranges -

# https://stackoverflow.com/a/40624614/ @Divakar
def create_ranges(start, stop, N, endpoint=True):
    if endpoint==1:
        divisor = N-1
    else:
        divisor = N
    steps = (1.0/divisor) * (stop - start)
    return steps[:,None]*np.arange(N) + start[:,None]

def ranges_based(a,N):
    ranges2D = create_ranges(a[:-1],a[1:],N-1,endpoint=False)
    return np.concatenate((ranges2D.ravel(),[a[-1]]))

Sample run -

In [151]: a
Out[151]: array([1, 4, 2])

In [152]: ranges_based(a,N=5)
Out[152]: array([1.  , 1.75, 2.5 , 3.25, 4.  , 3.5 , 3.  , 2.5 , 2.  ])
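To make the broadcasting step inside create_ranges more concrete, here is a small walk-through on the example array. It inlines the same arithmetic rather than calling the function; the names n_points and grid are mine.

```python
import numpy as np

a = np.array([1, 4, 2])
start, stop = a[:-1], a[1:]
n_points = 4  # N-1 in ranges_based: points per pair, right endpoint excluded

steps = (stop - start) / n_points                       # one step size per pair
# Column vector of steps times a row vector 0..n_points-1 gives one row per pair
grid = steps[:, None] * np.arange(n_points) + start[:, None]
print(grid)

flat = np.concatenate((grid.ravel(), [a[-1]]))          # add the final endpoint once
print(flat)
```

Each row of grid holds the points of one segment, so ravel() lays the segments end to end without repeating the shared endpoints.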

Benchmarking the vectorized solutions

# @Psidom's soln
def interp_based(a,N=5):
    s = N-1
    l = (a.size - 1) * s + 1    # total length after interpolation
    return np.interp(np.arange(l), np.arange(l, step=s), a)   

Timings on large arrays with N=5 -

In [199]: np.random.seed(0)

In [200]: a = np.random.randint(0,10,(10000))

In [201]: %timeit interp_based(a,N=5)
     ...: %timeit ranges_based(a,N=5)
1000 loops, best of 3: 318 µs per loop
1000 loops, best of 3: 227 µs per loop

In [202]: np.random.seed(0)

In [203]: a = np.random.randint(0,10,(100000))

In [204]: %timeit interp_based(a,N=5)
     ...: %timeit ranges_based(a,N=5)
100 loops, best of 3: 3.39 ms per loop
100 loops, best of 3: 2.77 ms per loop

Timings on large arrays with a larger N=50 -

In [205]: np.random.seed(0)

In [206]: a = np.random.randint(0,10,(10000))

In [207]: %timeit interp_based(a,N=50)
     ...: %timeit ranges_based(a,N=50)
100 loops, best of 3: 3.65 ms per loop
100 loops, best of 3: 2.14 ms per loop

In [208]: np.random.seed(0)

In [209]: a = np.random.randint(0,10,(100000))

In [210]: %timeit interp_based(a,N=50)
     ...: %timeit ranges_based(a,N=50)
10 loops, best of 3: 43.4 ms per loop
10 loops, best of 3: 31.1 ms per loop

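The larger N is, the greater the speedup from create_ranges appears to be.

Further improvement

We can optimize further by doing the concatenation up front, on the input, and then slicing at the end; this avoids concatenating the much larger output array, like so -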

def ranges_based_v2(a,N):
    start = a
    stop = np.concatenate((a[1:],[0]))
    return create_ranges(start, stop, N-1, endpoint=False).ravel()[:-N+2]
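Since the timings only mean something if the variants return the same values, a quick consistency check can be useful. The sketch below restates the functions from this answer in compact form and compares them on random input; the test size of 1000 is an arbitrary choice of mine.

```python
import numpy as np

def create_ranges(start, stop, N, endpoint=True):
    divisor = N - 1 if endpoint else N
    steps = (1.0 / divisor) * (stop - start)
    return steps[:, None] * np.arange(N) + start[:, None]

def interp_based(a, N=5):
    s = N - 1
    l = (a.size - 1) * s + 1
    return np.interp(np.arange(l), np.arange(l, step=s), a)

def ranges_based(a, N):
    ranges2D = create_ranges(a[:-1], a[1:], N - 1, endpoint=False)
    return np.concatenate((ranges2D.ravel(), [a[-1]]))

def ranges_based_v2(a, N):
    stop = np.concatenate((a[1:], [0]))
    return create_ranges(a, stop, N - 1, endpoint=False).ravel()[:-N + 2]

rng = np.random.RandomState(0)
a = rng.randint(0, 10, 1000)
for N in (5, 50):
    ref = interp_based(a, N)
    assert np.allclose(ranges_based(a, N), ref)
    assert np.allclose(ranges_based_v2(a, N), ref)
print("all implementations agree")
```

One caveat with ranges_based_v2 as written: for N=2 the slice [:-N+2] becomes [:0] and returns an empty array, so it appears to need N >= 3.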

Timings on the larger array with N=5 and N=50 -

In [243]: np.random.seed(0)

In [244]: a = np.random.randint(0,10,(100000))

In [245]: %timeit interp_based(a,N=5)
     ...: %timeit ranges_based(a,N=5)
     ...: %timeit ranges_based_v2(a,N=5)
100 loops, best of 3: 3.38 ms per loop
100 loops, best of 3: 2.71 ms per loop
100 loops, best of 3: 2.49 ms per loop

In [246]: %timeit interp_based(a,N=50)
     ...: %timeit ranges_based(a,N=50)
     ...: %timeit ranges_based_v2(a,N=50)
10 loops, best of 3: 42.8 ms per loop
10 loops, best of 3: 30.1 ms per loop
10 loops, best of 3: 22.2 ms per loop

Going further with numexpr

We can use the numexpr module to take advantage of multiple cores -

# https://stackoverflow.com/a/40624614/ @Divakar
import numexpr as ne
def create_ranges_numexpr(start, stop, N, endpoint=True):
    if endpoint==1:
        divisor = N-1
    else:
        divisor = N
    s0 = start[:,None]
    s1 = stop[:,None]
    r = np.arange(N)
    return ne.evaluate('((1.0/divisor) * (s1 - s0))*r + s0')

def ranges_based_v3(a,N):
    start = a
    stop = np.concatenate((a[1:],[0]))
    return create_ranges_numexpr(start, stop, N-1, endpoint=False).ravel()[:-N+2]

Timings -

In [276]: np.random.seed(0)

In [277]: a = np.random.randint(0,10,(100000))

In [278]: %timeit interp_based(a,N=5)
     ...: %timeit ranges_based(a,N=5)
     ...: %timeit ranges_based_v2(a,N=5)
     ...: %timeit ranges_based_v3(a,N=5)
100 loops, best of 3: 3.39 ms per loop
100 loops, best of 3: 2.75 ms per loop
100 loops, best of 3: 2.49 ms per loop
1000 loops, best of 3: 1.17 ms per loop

In [279]: %timeit interp_based(a,N=50)
     ...: %timeit ranges_based(a,N=50)
     ...: %timeit ranges_based_v2(a,N=50)
     ...: %timeit ranges_based_v3(a,N=50)
10 loops, best of 3: 43.1 ms per loop
10 loops, best of 3: 31.3 ms per loop
10 loops, best of 3: 22.3 ms per loop
100 loops, best of 3: 11.4 ms per loop

Answer 3: (score: 0)

You can first create an array of start and stop points, and then map linspace over that array.

v = np.column_stack([a[:-1], a[1:]])   # one (start, stop) pair per row
ls = np.apply_along_axis(lambda x: np.linspace(*x, 5), 1, v)

The last column contains duplicated endpoints (except in the last row). We can use a mask to pick the "right" elements.

mask = np.ones((len(a)-1,5),dtype='bool')
mask[:-1,-1] = 0

output = ls[mask]

Note that you can also use slicing and reshaping to select the values.

output = np.zeros(4*(len(a)-1)+1)
output[:-1] = np.reshape(ls[:,:-1],-1)
output[-1] = a[-1]
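As a further option not covered in this answer: newer NumPy versions (1.16 and later, to my knowledge) let np.linspace take arrays for start and stop, which removes the need for apply_along_axis and the mask. A minimal sketch under that version assumption; the name segments is mine.

```python
import numpy as np

a = np.array([1, 4, 2])
N = 5  # points per pair, endpoints included

# Array-valued start/stop; axis=-1 puts the samples of each pair along the last axis.
# endpoint=False drops each right endpoint so shared values are not duplicated.
segments = np.linspace(a[:-1], a[1:], N - 1, endpoint=False, axis=-1)
output = np.append(segments.ravel(), a[-1])
print(output)
```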