pandas DataFrame: combining a NaN-column condition with other conditions

Time: 2019-04-01 19:28:16

Tags: python pandas numpy

I want to calculate, from a dataset, how long a server was stopped. I know when each stop begins, but not how long it lasts.

I have this df:

index                   a          b     c     reboot   stop
2018-06-25 12:49:00    NaN        NaN   NaN     0         1
2018-06-25 12:50:00    NaN        NaN   NaN     0         1
2018-06-25 12:51:00    NaN        NaN   NaN     1         1
2018-06-25 12:52:00    NaN        NaN   NaN     0         1
2018-06-25 12:53:00    NaN        NaN   NaN     0         1
2018-06-25 12:54:00    NaN        NaN   NaN     0         1
2018-06-25 12:55:00    NaN        NaN   NaN     0         1
2018-06-25 12:56:00    NaN        NaN   1.2     0         0
2018-06-25 12:57:00    NaN        NaN   NaN     0         1
2018-06-25 12:58:00    NaN        NaN   NaN     1         1
2018-06-25 12:59:00    NaN        NaN   NaN     0         1
2018-06-25 13:00:00    NaN        NaN   NaN     0         1
2018-06-25 13:01:00    NaN        NaN   NaN     0         0

My server is stopped when a, b, c are all NaN and reboot = 1 and stop = 1, and it is running again from the row where reboot = 0 and stop = 0.
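To make the rule concrete, here is a minimal sketch that rebuilds the sample frame and writes both conditions as boolean masks. It assumes the timestamps live in an ordinary column named `index` (as the printed result in the answer suggests); the names `stop_begins` and `running_again` are illustrative and do not come from the question.

```python
import numpy as np
import pandas as pd

# Rebuild the sample frame; 'index' is assumed to be a regular column.
times = pd.date_range("2018-06-25 12:49", periods=13, freq="min")
df = pd.DataFrame({
    "index": times,
    "a": np.nan,
    "b": np.nan,
    "c": [np.nan] * 7 + [1.2] + [np.nan] * 5,
    "reboot": [0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0],
    "stop":   [1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0],
})

# A stop begins where a, b and c are all NaN and reboot == stop == 1.
stop_begins = (
    df[["a", "b", "c"]].isna().all(axis=1)
    & df["reboot"].eq(1)
    & df["stop"].eq(1)
)

# The server is running again where reboot == stop == 0.
running_again = df["reboot"].eq(0) & df["stop"].eq(0)

print(df.loc[stop_begins, "index"].dt.strftime("%H:%M").tolist())
print(df.loc[running_again, "index"].dt.strftime("%H:%M").tolist())
```

With the sample data, the first mask selects 12:51 and 12:58 and the second selects 12:56 and 13:01, so each stop runs from a row in the first list up to, but not including, the next row in the second.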

Desired output:

index                        period
2018-06-25 12:51:00             5
2018-06-25 12:58:00             3

1 Answer:

Answer 0: (score: 1)

This will do what you want:

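import numpy as np
import pandas as pd

# Create a new column which identifies stopped times
df['stopped'] = np.nan
# A stop begins when a, b and c are all NaN and reboot == stop == 1
idx_stopped = (pd.isnull(df.a)) & (pd.isnull(df.b)) & (pd.isnull(df.c)) & (df.reboot == 1) & (df.stop == 1)
df.loc[idx_stopped, 'stopped'] = 1
# The server is running again when reboot == stop == 0
df.loc[(df.reboot == 0) & (df.stop == 0), 'stopped'] = 0
# Carry each state forward; rows before the first event count as running
df.stopped = df.stopped.ffill()
df.stopped = df.stopped.fillna(0)
# Keep 1 for stopped rows and NaN for running rows
df.loc[df.stopped == 0, 'stopped'] = np.nan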

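# Count the number of periods for each stop instance
# Reverse the series so the full count lands on the first row of each stop
v = df.stopped[::-1]
# Running count of stopped rows, carried forward across the running rows
cumsum = v.cumsum().ffill()
# At each running row: minus the stopped rows counted since the previous running row
reset = -cumsum[v.isnull()].diff().fillna(cumsum)
# Adding the resets brings the count back to zero after every stop
result = v.where(v.notnull(), reset).cumsum()
df['period'] = result[::-1]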

# Identify the time each stop incident began
df['first'] = (df.stopped == 1) & (pd.isnull(df.stopped.shift(1)))
# Take a copy so the dtype change below does not touch a slice of df
df2 = df.loc[df['first'], ['index', 'period']].copy()
df2['period'] = df2['period'].astype(int)

print(df2)
                 index  period
2  2018-06-25 12:51:00       5
9  2018-06-25 12:58:00       3
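
As an alternative to the reversed cumulative sum above, the same counts can be obtained by labelling each stop as a run and grouping on that label. This is only a sketch under the same assumptions as the answer (one row per minute, timestamps in a column named `index`); the helper names `state`, `stopped`, `starts`, `run_id` and `runs` are illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

# Sample data from the question; 'index' is assumed to be a regular column.
df = pd.DataFrame({
    "index": pd.date_range("2018-06-25 12:49", periods=13, freq="min"),
    "a": np.nan,
    "b": np.nan,
    "c": [np.nan] * 7 + [1.2] + [np.nan] * 5,
    "reboot": [0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0],
    "stop":   [1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0],
})

stop_begins = (
    df[["a", "b", "c"]].isna().all(axis=1)
    & df["reboot"].eq(1)
    & df["stop"].eq(1)
)
running_again = df["reboot"].eq(0) & df["stop"].eq(0)

# 1.0 where a stop begins, 0.0 where the server restarts, NaN elsewhere.
state = pd.Series(
    np.select([stop_begins, running_again], [1.0, 0.0], default=np.nan),
    index=df.index,
)
# Carry the last known state forward; rows before the first event count as running.
stopped = state.ffill().fillna(0.0).eq(1.0)

# A run starts on a stopped row whose previous row was not stopped.
starts = stopped & ~stopped.shift(fill_value=False)
# The cumulative sum gives every row the number of the most recent run.
run_id = starts.cumsum()

# For each run: first timestamp and number of stopped rows.
runs = (
    df.loc[stopped]
      .groupby(run_id[stopped])["index"]
      .agg(index="first", period="size")
      .reset_index(drop=True)
)
print(runs)
```

For the sample data this gives periods of 5 and 3 for the stops beginning at 12:51 and 12:58. Note that `period` counts rows, so it equals minutes only while the data has exactly one row per minute.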