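I have a DataFrame of data packets containing measurements, indexed by timestamp. Groups of flag messages that mark the start and end of a measurement section are interspersed among the messages. Here is an example: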
dev node meas 0 meas 1 ...
tstp
2016-04-12 03:42:16.238 instr None [val] [val]
2016-04-12 03:42:16.338 cntrl 101 [val] [val]
2016-04-12 03:42:16.442 instr None [val] [val]
2016-04-12 03:42:16.445 instr None [val] [val]
2016-04-12 03:42:16.445 cntrl 101 [val] [val]
2016-04-12 03:42:16.448 instr None [val] [val]
2016-04-12 03:42:16.540 instr None [val] [val]
2016-04-12 03:42:16.600 cntrl 101 [val] [val]
2016-04-12 03:42:16.639 instr None [val] [val]
2016-04-12 03:42:16.741 instr None [val] [val]
2016-04-12 03:42:17.238 instr None [val] [val]
2016-04-12 03:42:17.338 cntrl 102 [val] [val]
2016-04-12 03:42:17.442 instr None [val] [val]
2016-04-12 03:42:17.445 instr None [val] [val]
2016-04-12 03:42:17.445 cntrl 102 [val] [val]
2016-04-12 03:42:17.448 instr None [val] [val]
2016-04-12 03:42:17.540 instr None [val] [val]
2016-04-12 03:42:17.600 cntrl 102 [val] [val]
2016-04-12 03:42:17.639 instr None [val] [val]
2016-04-12 03:42:17.741 instr None [val] [val]
What I want to do is:
for name, group in pkts.groupby('node'):
    beg = group.index[0]
    end = group.index[-1]
    # pseudocode
    pkts[ beg:end & pkts.dev=='instr' , 'node' ] = name
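Slicing directly with beg:end does not work because the index values are not unique. Can anyone offer some insight or a better approach?
Update (clarification):
Goal: easy access to the measurements of the "instr" device by node number. The "instr" device cannot transmit a node value.
Desired output (what I originally had in mind; open to suggestions):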
dev node meas 0 meas 1 ...
tstp
2016-04-12 03:42:16.238 instr None [val] [val]
2016-04-12 03:42:16.338 cntrl 101 [val] [val]
2016-04-12 03:42:16.442 instr 101 [val] [val]
2016-04-12 03:42:16.445 instr 101 [val] [val]
2016-04-12 03:42:16.445 cntrl 101 [val] [val]
2016-04-12 03:42:16.448 instr 101 [val] [val]
2016-04-12 03:42:16.540 instr 101 [val] [val]
2016-04-12 03:42:16.600 cntrl 101 [val] [val]
2016-04-12 03:42:16.639 instr None [val] [val]
2016-04-12 03:42:16.741 instr None [val] [val]
2016-04-12 03:42:17.238 instr None [val] [val]
2016-04-12 03:42:17.338 cntrl 102 [val] [val]
2016-04-12 03:42:17.442 instr 102 [val] [val]
2016-04-12 03:42:17.445 instr 102 [val] [val]
2016-04-12 03:42:17.445 cntrl 102 [val] [val]
2016-04-12 03:42:17.448 instr 102 [val] [val]
2016-04-12 03:42:17.540 instr 102 [val] [val]
2016-04-12 03:42:17.600 cntrl 102 [val] [val]
2016-04-12 03:42:17.639 instr None [val] [val]
2016-04-12 03:42:17.741 instr None [val] [val]
Answer 0 (score: 1)
I think you can create a MultiIndex from the index with reset_index and set_index, then replace None with NaN and fill the gaps inside each node's span with ffill and bfill:
pkts = pkts.reset_index().set_index('tstp', append=True)
print(pkts)
dev node meas 0 meas 1
tstp
0 2016-04-12 03:42:16.238 instr None [val] [val]
1 2016-04-12 03:42:16.338 cntrl 101 [val] [val]
2 2016-04-12 03:42:16.442 instr None [val] [val]
3 2016-04-12 03:42:16.445 instr None [val] [val]
4 2016-04-12 03:42:16.445 cntrl 101 [val] [val]
5 2016-04-12 03:42:16.448 instr None [val] [val]
6 2016-04-12 03:42:16.540 instr None [val] [val]
7 2016-04-12 03:42:16.600 cntrl 101 [val] [val]
8 2016-04-12 03:42:16.639 instr None [val] [val]
9 2016-04-12 03:42:16.741 instr None [val] [val]
10 2016-04-12 03:42:16.238 instr None [val] [val]
11 2016-04-12 03:42:16.338 cntrl 102 [val] [val]
12 2016-04-12 03:42:16.442 instr None [val] [val]
13 2016-04-12 03:42:16.445 instr None [val] [val]
14 2016-04-12 03:42:16.445 cntrl 102 [val] [val]
15 2016-04-12 03:42:16.448 instr None [val] [val]
16 2016-04-12 03:42:16.540 instr None [val] [val]
17 2016-04-12 03:42:16.600 cntrl 102 [val] [val]
18 2016-04-12 03:42:16.639 instr None [val] [val]
19 2016-04-12 03:42:16.741 instr None [val] [val]
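import numpy as np

# Turn the 'None' placeholders into real missing values so they can be filled
pkts['node'] = pkts['node'].replace('None', np.nan)
for name, group in pkts.groupby('node'):
    # First and last row of this node's cntrl messages
    beg = group.index[0]
    end = group.index[-1]
    # Forward-fill, then back-fill, only between those two rows
    pkts.loc[beg:end, 'node'] = pkts.loc[beg:end, 'node'].ffill().bfill()
print(pkts)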
dev node meas 0 meas 1
tstp
0 2016-04-12 03:42:16.238 instr NaN [val] [val]
1 2016-04-12 03:42:16.338 cntrl 101 [val] [val]
2 2016-04-12 03:42:16.442 instr 101 [val] [val]
3 2016-04-12 03:42:16.445 instr 101 [val] [val]
4 2016-04-12 03:42:16.445 cntrl 101 [val] [val]
5 2016-04-12 03:42:16.448 instr 101 [val] [val]
6 2016-04-12 03:42:16.540 instr 101 [val] [val]
7 2016-04-12 03:42:16.600 cntrl 101 [val] [val]
8 2016-04-12 03:42:16.639 instr NaN [val] [val]
9 2016-04-12 03:42:16.741 instr NaN [val] [val]
10 2016-04-12 03:42:16.238 instr NaN [val] [val]
11 2016-04-12 03:42:16.338 cntrl 102 [val] [val]
12 2016-04-12 03:42:16.442 instr 102 [val] [val]
13 2016-04-12 03:42:16.445 instr 102 [val] [val]
14 2016-04-12 03:42:16.445 cntrl 102 [val] [val]
15 2016-04-12 03:42:16.448 instr 102 [val] [val]
16 2016-04-12 03:42:16.540 instr 102 [val] [val]
17 2016-04-12 03:42:16.600 cntrl 102 [val] [val]
18 2016-04-12 03:42:16.639 instr NaN [val] [val]
19 2016-04-12 03:42:16.741 instr NaN [val] [val]
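For anyone who wants to try the idea end to end, here is a minimal self-contained sketch. It is a simplified variant, not the exact code above: it assumes a small made-up sample (the timestamps, the dev/node values and the missing meas columns are illustrative only), and it keeps tstp as an ordinary column after reset_index instead of building a MultiIndex, so that the unique integer row labels alone make the beg:end slice unambiguous. It also assumes each node's cntrl messages form a single contiguous section.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data shaped like the packets in the question.
tstp = pd.to_datetime([
    "2016-04-12 03:42:16.238", "2016-04-12 03:42:16.338",
    "2016-04-12 03:42:16.442", "2016-04-12 03:42:16.445",
    "2016-04-12 03:42:16.445", "2016-04-12 03:42:16.600",
    "2016-04-12 03:42:16.639", "2016-04-12 03:42:17.338",
    "2016-04-12 03:42:17.442", "2016-04-12 03:42:17.600",
    "2016-04-12 03:42:17.639",
])
pkts = pd.DataFrame(
    {
        "dev": ["instr", "cntrl", "instr", "instr", "cntrl", "cntrl",
                "instr", "cntrl", "instr", "cntrl", "instr"],
        "node": [np.nan, 101, np.nan, np.nan, 101, 101,
                 np.nan, 102, np.nan, 102, np.nan],
    },
    index=pd.Index(tstp, name="tstp"),
)

# Replace the duplicated timestamp index with unique row numbers;
# tstp is kept as a regular column.
pkts = pkts.reset_index()

# Rows with a missing node are skipped by groupby, so each group holds
# only the cntrl rows of one node.
for name, group in pkts.groupby("node"):
    beg, end = group.index[0], group.index[-1]
    # .loc label slices are inclusive on both ends.
    pkts.loc[beg:end, "node"] = pkts.loc[beg:end, "node"].ffill().bfill()

print(pkts)
```

Once node is filled in, something like pkts[(pkts.dev == 'instr') & (pkts.node == 101)] selects the instr measurements for one node, and pkts.set_index('tstp') restores the timestamp index if it is needed afterwards.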