I have a large set (thousands) of smooth lines (series of x, y pairs). Each line has its own x and y sampling and its own length, i.e.
x_0 = {x_00, x_01, ..., } # length n_0
x_1 = {x_10, x_11, ..., } # length n_1
...
x_m = {x_m0, x_m1, ..., } # length n_m
y_0 = {y_00, y_01, ..., } # length n_0
y_1 = {y_10, y_11, ..., } # length n_1
...
y_m = {y_m0, y_m1, ..., } # length n_m
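I would like to find cumulative properties of the lines after each one has been interpolated onto a common, regular set of x points, i.e. x = {x_0, x_1, ..., x_n-1}
Currently I for-loop over each line, create an interpolant, resample, and then take the sum/median/whatever of the result. It works, but it is really slow. Is there any way to vectorize/matricize this operation?
I was thinking that, since linear interpolation can be expressed as a matrix operation, this might be possible. At the same time, each line can have a different length, so it could get complicated. Edit: but zero-padding the shorter arrays would be easy...
What I am doing now looks something like: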
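The padding idea could be sketched roughly as follows. This is a minimal sketch, not a tested solution for the real data: the helper name `interp_padded` is hypothetical, and it assumes every line has at least two points with strictly increasing x. Shorter lines are padded with `inf` (for x) and `NaN` (for y) rather than zeros so that the padding can never be mistaken for a sample. Values outside a line's x range are held at the end values, which matches what `np.interp` does by default.

```python
import numpy as np

def interp_padded(xx, yy, xref):
    """Linearly interpolate many ragged lines onto xref in one vectorized pass.

    Hypothetical helper; assumes each line has >= 2 points and increasing x.
    """
    xref = np.asarray(xref, dtype=float)
    m = len(xx)
    nmax = max(len(xi) for xi in xx)

    # Pad to rectangular arrays: +inf keeps every x row sorted, NaN marks unused y.
    xpad = np.full((m, nmax), np.inf)
    ypad = np.full((m, nmax), np.nan)
    lengths = np.empty(m, dtype=int)
    for i, (xi, yi) in enumerate(zip(xx, yy)):
        lengths[i] = len(xi)
        xpad[i, :len(xi)] = xi
        ypad[i, :len(yi)] = yi

    # Count samples <= each reference point: that count is the index of the
    # right-hand neighbour. The intermediate array has shape (m, len(xref), nmax).
    hi = (xpad[:, None, :] <= xref[None, :, None]).sum(axis=-1)
    hi = np.clip(hi, 1, lengths[:, None] - 1)  # stay inside the real samples
    lo = hi - 1

    x0 = np.take_along_axis(xpad, lo, axis=1)
    x1 = np.take_along_axis(xpad, hi, axis=1)
    y0 = np.take_along_axis(ypad, lo, axis=1)
    y1 = np.take_along_axis(ypad, hi, axis=1)

    # Clipping the weight to [0, 1] holds the end values outside each line's range.
    t = np.clip((xref - x0) / (x1 - x0), 0.0, 1.0)
    return y0 + t * (y1 - y0)

# Two lines of different length, resampled onto a common grid.
xx = [[0.0, 1.0, 2.0, 3.0], [0.0, 2.0]]
yy = [[0.0, 1.0, 4.0, 9.0], [1.0, 5.0]]
xref = np.linspace(-1.0, 4.0, 11)
yref = interp_padded(xx, yy, xref)  # shape (2, 11), one row per line
```

The remaining Python loop only copies data into the padded arrays. The comparison step allocates a boolean array of size (number of lines) x (number of reference points) x (longest line), so for thousands of long lines it may need to be applied in chunks.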
import numpy as np
import scipy as sp
import scipy.interpolate
...
# `xx` and `yy` are lists of lists with the x and y points respectively
# `xref` are the reference x values at which I want interpolants
yref = np.zeros([len(xx), len(xref)])
for ii, (xi, yi) in enumerate(zip(xx, yy)):
    # np.interp is the NumPy function that the deprecated sp.interp alias pointed to
    yref[ii] = np.interp(xref, xi, yi)
y_med = np.median(yref, axis=-1)
y_sum = np.sum(yref, axis=-1)
...
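Answer 0 (score: 1)
Hopefully you can adapt the following to your needs.
I have included pandas because it has an interpolate function for filling in missing values.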
import pandas as pd
import numpy as np
x = np.arange(19)
x_0 = x[::2]
x_1 = x[::3]
np.random.seed([3,1415])
y_0 = x_0 + np.random.randn(len(x_0)) * 2
y_1 = x_1 + np.random.randn(len(x_1)) * 2
xy_0 = pd.DataFrame(y_0, index=x_0)
xy_1 = pd.DataFrame(y_1, index=x_1)
Note: x has length 19, x_0 has length 10, and x_1 has length 7.
xy_0 looks like:
0
0 -4.259448
2 -0.536932
4 0.059001
6 1.481890
8 7.301427
10 9.946090
12 12.632472
14 14.697564
16 17.430729
18 19.541526
xy_0 can be aligned with x via reindex:
xy_0.reindex(x)
0
0 -4.259448
1 NaN
2 -0.536932
3 NaN
4 0.059001
5 NaN
6 1.481890
7 NaN
8 7.301427
9 NaN
10 9.946090
11 NaN
12 12.632472
13 NaN
14 14.697564
15 NaN
16 17.430729
17 NaN
18 19.541526
We can then fill in the gaps with interpolate:
xy_0.reindex(x).interpolate()
0
0 -4.259448
1 -2.398190
2 -0.536932
3 -0.238966
4 0.059001
5 0.770445
6 1.481890
7 4.391659
8 7.301427
9 8.623759
10 9.946090
11 11.289281
12 12.632472
13 13.665018
14 14.697564
15 16.064147
16 17.430729
17 18.486128
18 19.541526
The same steps for xy_1:
xy_1.reindex(x)
0
0 -1.216416
1 NaN
2 NaN
3 3.704781
4 NaN
5 NaN
6 5.294958
7 NaN
8 NaN
9 8.168262
10 NaN
11 NaN
12 10.176849
13 NaN
14 NaN
15 14.714924
16 NaN
17 NaN
18 19.493678
Interpolated:
xy_1.reindex(x).interpolate()
0
0 -1.216416
1 0.423983
2 2.064382
3 3.704781
4 4.234840
5 4.764899
6 5.294958
7 6.252726
8 7.210494
9 8.168262
10 8.837791
11 9.507320
12 10.176849
13 11.689541
14 13.202233
15 14.714924
16 16.307842
17 17.900760
18 19.493678
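To get from the per-line steps above to the aggregates asked about in the question, the resampled lines could be collected into a single DataFrame and reduced across columns. The following is only a sketch under some assumptions: the helper name `resample_lines` is hypothetical, each line's x values are assumed unique, and the aggregate is assumed to be wanted across lines at each reference x. Unlike the example above, a line's own x values need not lie on the reference grid, so each line is first reindexed onto the union of its x values and the grid, and `interpolate(method="index")` is used so that the actual x spacing is respected.

```python
import numpy as np
import pandas as pd

def resample_lines(xx, yy, xref):
    """Return a DataFrame with one column per line, indexed by xref.

    Hypothetical helper; assumes the x values within each line are unique.
    """
    xref = np.asarray(xref, dtype=float)
    columns = {}
    for i, (xi, yi) in enumerate(zip(xx, yy)):
        s = pd.Series(np.asarray(yi, dtype=float),
                      index=np.asarray(xi, dtype=float))
        # Union of the line's own x values and the reference grid, sorted.
        grid = s.index.union(pd.Index(xref))
        # method="index" weights by the numeric index instead of by row position.
        columns[i] = s.reindex(grid).interpolate(method="index").reindex(xref)
    return pd.DataFrame(columns)

# Small illustrative data set (made up for this sketch).
xx = [[0.0, 2.0, 4.0], [0.0, 1.0, 4.0]]
yy = [[0.0, 2.0, 4.0], [0.0, 2.0, 8.0]]
xref = [0.0, 1.0, 2.0, 3.0, 4.0]

resampled = resample_lines(xx, yy, xref)
y_sum = resampled.sum(axis=1)     # sum over lines at each reference x
y_med = resampled.median(axis=1)  # median over lines at each reference x
```

This still loops over the lines in Python, so it is mainly a convenience for alignment and aggregation rather than a speed-up. Reference points that fall outside a line's x range are worth checking separately, since how pandas treats leading and trailing gaps depends on the `limit_direction` and `limit_area` arguments of `interpolate`.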