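I want to simulate a system that contains the following:
For example, if f samples every 5 ms and g samples every 10 ms, the sequence might look like this: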
0ms: f samples, g samples
5ms: f samples
10ms: f samples, g samples
12ms: event arrives
13ms: event arrives
15ms: f samples
20ms: f samples, g samples
25ms: f samples
27ms: event arrives
30ms: f samples, g samples
35ms: f samples
37ms: event arrives
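For each event, the consumer whose sample is closest to the event time (but after it) "wins". In the case of a tie, the winner should be chosen at random. For example, f's 15 ms sample wins both the 12 ms and the 13 ms events.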
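The rule above can be sketched without pandas by looking up, for each event, the first sample time that follows it. This is only a minimal sketch: the helper name `next_sample` is hypothetical, and it assumes "after" means strictly after, so a sample taken at exactly the event time does not count.

```python
import numpy as np

# Sample times from the example: f every 5 ms, g every 10 ms.
f = np.arange(0, 40, 5)
g = np.arange(0, 40, 10)
events = np.array([12, 13, 27, 37])

def next_sample(samples, events):
    """Hypothetical helper: first sample strictly after each event, NaN if none."""
    # side="right" skips samples equal to the event time (assumption: strictly after).
    idx = np.searchsorted(samples, events, side="right")
    out = np.full(len(events), np.nan)
    has_next = idx < len(samples)
    out[has_next] = samples[idx[has_next]]
    return out

next_f = next_sample(f, events)
next_g = next_sample(g, events)
print(next_f)
print(next_g)
```

The consumer with the smaller value wins each event; equal values are ties, and NaN marks an event with no later sample (here the one at 37 ms).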
I tried to implement this by merging the timelines into a single index:
import numpy as np
import pandas as pd
f = np.arange(0, 40, 5)
g = np.arange(0, 40, 10)
events = [12, 13, 27, 37]
df = pd.concat([pd.Series(f, f), pd.Series(g, g), pd.Series(events, events)], axis=1)
This produces a DataFrame like this:
      f    g  events
0     0    0     NaN
5     5  NaN     NaN
10   10   10     NaN
12  NaN  NaN      12
13  NaN  NaN      13
15   15  NaN     NaN
20   20   20     NaN
25   25  NaN     NaN
27  NaN  NaN      27
30   30   30     NaN
35   35  NaN     NaN
37  NaN  NaN      37
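I have been trying to find the winners by applying various operations to aggregations such as the following:
In [103]: df.expanding().max()  # pd.expanding_max(df) in old pandas versions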
      f    g  events
0     0    0     NaN
5     5    0     NaN
10   10   10     NaN
12   10   10      12
13   10   10      13
15   15   10      13
20   20   20      13
25   25   20      13
27   25   20      27
30   30   30      27
35   35   30      27
37   35   30      37
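...but I have had a hard time finding a pandas way to do it.
I feel like I am very close with the following:
In [141]: x = df.sort_index(ascending=False).expanding().min()  # was pd.expanding_min(df.sort(ascending=False))
gx = x.groupby('events')
print(gx.max())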
events
12 15 20
13 15 20
27 30 30
37 35 30
Any ideas?
Answer 0 (score: 2)
Use bfill to backward-fill the NaN values in the "f" and "g" columns:
import numpy as np
import pandas as pd
f = np.arange(0, 40, 5)
g = np.arange(0, 40, 10)
events = [12, 13, 27, 37]
df = pd.concat([pd.Series(f, f), pd.Series(g, g), pd.Series(events, events)], axis=1)
df.columns = "f", "g", "event"
df[["f", "g"]] = df[["f", "g"]].bfill()
df2 = df.dropna()
print(df2)
Here is the output:
     f   g  event
12  15  20     12
13  15  20     13
27  30  30     27
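As a cross-check on the table above, the same "next sample per event" lookup can be sketched with `pd.merge_asof` using `direction="forward"`. This is an alternative technique, not the method this answer uses. It assumes that a sample at exactly the event time should not count (`allow_exact_matches=False`), and the frame names `ev` and `out` are illustrative only.

```python
import numpy as np
import pandas as pd

# Keys are given an explicit int64 dtype so both sides of the merge match.
ev = pd.DataFrame({"event": np.array([12, 13, 27, 37], dtype="int64")})
f = pd.DataFrame({"f": np.arange(0, 40, 5, dtype="int64")})
g = pd.DataFrame({"g": np.arange(0, 40, 10, dtype="int64")})

out = ev
for samples in (f, g):
    key = samples.columns[0]
    # direction="forward" picks the first sample after each event;
    # allow_exact_matches=False excludes a sample at the event time itself.
    out = pd.merge_asof(out, samples, left_on="event", right_on=key,
                        direction="forward", allow_exact_matches=False)
print(out)
```

Unlike the `dropna()` step, this keeps the event at 37 with NaN in both columns, which makes it explicit that no later sample exists for it.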
Then we can compare f and g:
print(np.sign(df2.f - df2.g).replace({-1:"f", 1:"g", 0:"fg"}))
The output is:
12 f
13 f
27 fg
dtype: object
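This means that "f" wins the events at 12 and 13, and that the winner of the event at 27 should be chosen at random. The event at 37 does not appear because no sample follows it, so its row still contains NaN after bfill and is removed by dropna().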
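The random choice for ties could be sketched as follows. This is one possible approach rather than part of the original answer: the helper `pick_winner` is hypothetical, the seed is arbitrary, and `df2` is rebuilt by hand from the output shown above so that the snippet runs on its own.

```python
import numpy as np
import pandas as pd

# Next sample time per event, copied from the bfill output above.
df2 = pd.DataFrame({"f": [15, 15, 30], "g": [20, 20, 30]}, index=[12, 13, 27])

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility

def pick_winner(row):
    """Hypothetical helper: name of the earliest consumer, random among ties."""
    earliest = row.min()
    candidates = row.index[row == earliest].tolist()
    return rng.choice(candidates)

winners = df2[["f", "g"]].apply(pick_winner, axis=1)
print(winners)
```

Events 12 and 13 always go to "f", while event 27 goes to either "f" or "g" depending on the random draw.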