What is an efficient way to get the number of rows until the next signal value in numpy?
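I have a list of signal values (-1, nan, 1) that looks like the table below, and I want to get another list holding the number of rows until the next signal, with the sign of the result following the sign of the signal.
Given the second column of this table, signal, I want to generate the third column, backward: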
+-------+--------+----------+
| index | signal | backward |
+-------+--------+----------+
| 0 | | |
| 1 | | |
| 2 | | |
| 3 | 1 | 4 |
| 4 | | 3 |
| 5 | | 2 |
| 6 | | 1 |
| 7 | -1 | -3 |
| 8 | | -2 |
| 9 | | -1 |
| 10 | 1 | 3 |
| 11 | | 2 |
| 12 | | 1 |
| 13 | 1 | 5 |
| 14 | | 4 |
| 15 | | 3 |
| 16 | | 2 |
| 17 | | 1 |
| 18 | -1 | -3 |
| 19 | | -2 |
| 20 | | -1 |
| 21 | -1 | -5 |
| 22 | | -4 |
| 23 | | -3 |
| 24 | | -2 |
| 25 | | -1 |
| 26 | 1 | 4 |
| 27 | | 3 |
| 28 | | 2 |
| 29 | | 1 |
+-------+--------+----------+
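The original numpy array looks something like this. Please excuse the way I create this random list, I don't know a better way :) It is for demonstration purposes only.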
import numpy as np
data = np.random.randint(-4, 4, (1000,)).astype(float)
data[data == -2] = np.nan
data[data == -3] = np.nan
data[data == -4] = np.nan
data[data == 0] = np.nan
data[data == 2] = np.nan
data[data == 3] = np.nan
print(data)
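It has a few million elements, so the solution needs to be as efficient as possible.
Answer 0 (score: 3)
Here is an approach based on cumulative summation: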
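def seq_descending(a):
    # Positions of the signals (non-NaN entries); assumes at least one signal
    mask = ~np.isnan(a)
    idx = np.flatnonzero(mask)
    # Segment lengths: gap to the next signal, or to the end for the last one
    shift_idx = np.hstack((idx[1:] - idx[:-1], a.size - idx[-1]))
    # Step array: -1 everywhere, with a jump at each signal that restarts the countdown
    out = -np.ones(a.size, dtype=int)
    out[idx] = shift_idx - 1
    idx0 = idx[0]
    out[:idx0] = 0  # rows before the first signal stay 0
    out[idx0] += 1
    cumsums = out.cumsum()
    # Apply the sign of each signal to every row of its segment
    signs = np.repeat(a[idx].astype(int), shift_idx)
    cumsums[idx0:] *= signs
    return cumsums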
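To see why a cumulative sum produces the countdown, it may help to print the intermediate arrays for a tiny input. The following is only a sketch of the same idea on a six-element array; the names `seg_len`, `steps` and `countdown` are illustrative and do not appear in the function above, and it assumes the input contains at least one signal.

```python
import numpy as np

a = np.array([np.nan, 1, np.nan, np.nan, -1, np.nan])

idx = np.flatnonzero(~np.isnan(a))          # signal positions: [1, 4]
seg_len = np.diff(np.append(idx, a.size))   # rows per segment: [3, 2]

# Every row steps down by 1; each signal row instead jumps back up
# so that the running sum restarts at the new segment length.
steps = -np.ones(a.size, dtype=int)
steps[idx] = seg_len - 1
steps[:idx[0]] = 0       # nothing to count before the first signal
steps[idx[0]] += 1       # the first segment starts from 0, not from 1
print(steps)             # [ 0  3 -1 -1  1 -1]

countdown = steps.cumsum()
print(countdown)         # [0 3 2 1 2 1]

# Repeat each signal's sign over its whole segment and apply it.
signs = np.repeat(a[idx].astype(int), seg_len)
countdown[idx[0]:] *= signs
print(countdown)         # [ 0  3  2  1 -2 -1]
```

Each signal row receives `seg_len - 1` because the previous segment's countdown always ends at 1, so adding `seg_len - 1` lands exactly on the new segment length.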
Sample run:
1) Set up the input array:
In [82]: a = np.full((30,), np.nan)
...: a[[3,7,10,13,18,21,26]] = [1,-1,1,1,-1,-1,1]
...:
2) Compute the output array and stack it next to the input for comparison:
In [83]: np.column_stack((a, seq_descending(a) ))
Out[83]:
array([[ nan, 0.],
[ nan, 0.],
[ nan, 0.],
[ 1., 4.],
[ nan, 3.],
[ nan, 2.],
[ nan, 1.],
[ -1., -3.],
[ nan, -2.],
[ nan, -1.],
[ 1., 3.],
[ nan, 2.],
[ nan, 1.],
[ 1., 5.],
[ nan, 4.],
[ nan, 3.],
[ nan, 2.],
[ nan, 1.],
[ -1., -3.],
[ nan, -2.],
[ nan, -1.],
[ -1., -5.],
[ nan, -4.],
[ nan, -3.],
[ nan, -2.],
[ nan, -1.],
[ 1., 4.],
[ nan, 3.],
[ nan, 2.],
[ nan, 1.]])
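When working with a vectorized solution like this, a slow but obvious loop can serve as a cross-check on small arrays. The sketch below is not part of the answer; `seq_descending_reference` is a hypothetical name, and the function is meant for testing only, not for arrays with millions of elements.

```python
import numpy as np

def seq_descending_reference(a):
    """Plain-loop reference: signed number of rows until the next signal."""
    out = np.zeros(a.size, dtype=int)

    # Backward pass: count rows remaining until the next signal (or the end).
    remaining = 0
    for i in range(a.size - 1, -1, -1):
        remaining += 1
        out[i] = remaining
        if not np.isnan(a[i]):
            remaining = 0  # a signal closes the segment that starts here

    # Forward pass: multiply by the sign of the most recent signal.
    # Rows before the first signal keep sign 0 and therefore become 0.
    sign = 0
    for i in range(a.size):
        if not np.isnan(a[i]):
            sign = int(a[i])
        out[i] *= sign
    return out

a = np.full((30,), np.nan)
a[[3, 7, 10, 13, 18, 21, 26]] = [1, -1, 1, 1, -1, -1, 1]
print(seq_descending_reference(a))
```

On the sample input this reproduces the second column shown above, and unlike the vectorized version it also returns all zeros for an array without any signal.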
Answer 1 (score: 1)
Data:
array([ nan, -1., nan, nan, nan, nan, nan, nan, nan, nan, nan,
nan, 1., -1., nan, -1., -1., nan, nan, -1.])
You can use pandas:
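import numpy as np
import pandas as pd
# Each signal squares to 1, so the cumulative sum labels the segment a row belongs to
df = pd.DataFrame({'id':np.square(np.nan_to_num(data)).cumsum(),'signal':data})
# Count down from the segment length to 1 within each segment
df['backward'] = df.groupby('id')['id'].transform(lambda x: np.arange(1, len(x)+1)[::-1])
# Apply the sign of the most recent signal
df['backward'] = df['backward']*df.signal.ffill()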
>>> df
id signal backward
0 0 NaN NaN
1 1 -1 -11
2 1 NaN -10
3 1 NaN -9
4 1 NaN -8
5 1 NaN -7
6 1 NaN -6
7 1 NaN -5
8 1 NaN -4
9 1 NaN -3
10 1 NaN -2
11 1 NaN -1
12 2 1 1
13 3 -1 -2
14 3 NaN -1
15 4 -1 -1
16 5 -1 -3
17 5 NaN -2
18 5 NaN -1
19 6 -1 -1
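The same idea can be written without a Python-level lambda by letting `cumcount(ascending=False)` do the countdown. This is a sketch under the assumption that signals are only ever -1, 1 or NaN; `backward_counts_pandas` is a hypothetical helper name, not something from the answer.

```python
import numpy as np
import pandas as pd

def backward_counts_pandas(signal):
    """Signed rows-until-next-signal using groupby; leading rows stay NaN."""
    s = pd.Series(signal, dtype=float)
    # +1 and -1 both square to 1, so the running sum grows by one
    # at every signal and gives each segment its own label.
    seg_id = s.fillna(0).pow(2).cumsum()
    # cumcount(ascending=False) numbers each group from len-1 down to 0.
    countdown = s.groupby(seg_id).cumcount(ascending=False) + 1
    # Rows before the first signal have no sign to inherit, so they become NaN.
    return countdown * s.ffill()

nan = np.nan
data = np.array([nan, -1, nan, nan, nan, nan, nan, nan, nan, nan, nan,
                 nan, 1, -1, nan, -1, -1, nan, nan, -1])
print(backward_counts_pandas(data).to_numpy())
```

Note that, as in the output above, rows before the first signal come out as NaN here, whereas the cumulative-sum answer returns 0 for them.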