I have a dataframe from which I compute a daily summary for a specific date. Below is the dataframe for the date 2018-02-11, where I computed the mean, min, max, and std:
cpu cpu cpu cpu mem mem mem mem load load load load drops drops drops drops latency latency latency latency gw_latency gw_latency gw_latency gw_latency upload upload upload upload download download download download sap_drops sap_drops sap_drops sap_drops sap_latency sap_latency sap_latency sap_latency
mean min max std mean min max std mean min max std mean min max std mean min max std mean min max std mean min max std mean min max std mean min max std mean min max std
date
2018-02-11 4.282442748 0 17 4.361148065 13.61068702 0 27 6.123815451 3.891450382 0 47.62 6.426298507 1.526717557 0 100 12.30842628 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Similarly, I have another dataframe for the date 2018-02-12, where I computed the same mean, min, max, and std:
cpu cpu cpu cpu mem mem mem mem load load load load drops drops drops drops latency latency latency latency gw_latency gw_latency gw_latency gw_latency upload upload upload upload download download download download sap_drops sap_drops sap_drops sap_drops sap_latency sap_latency sap_latency sap_latency
mean min max std mean min max std mean min max std mean min max std mean min max std mean min max std mean min max std mean min max std mean min max std mean min max std
date
2018-02-12 5.726315789 0 21 2.938315053 22.30526316 0 23 3.581474037 6.06 0 44.75 6.798944285 0.5263157895 0 100 7.254762501 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Here is the code:
import pandas as pd

# Load the raw samples and index them by timestamp so resample() works.
df = pd.read_csv("metrics.csv", parse_dates=["date"])
df = df.set_index("date")

cols = ['cpu', 'mem', 'load', 'drops', 'latency',
        'gw_latency', 'upload', 'download', 'sap_drops', 'sap_latency']
stats = ['mean', 'min', 'max', 'std']

# Daily mean/min/max/std per metric. Columns are selected with a list,
# since selecting with a bare tuple of names fails on recent pandas.
df_prev = df.loc['2018-02-11'].resample('D')[cols].agg(stats).fillna(0)
df_next = df.loc['2018-02-12'].resample('D')[cols].agg(stats).fillna(0)
Now I want to subtract the two dataframes to get the difference in values for each column. This is what I did:
df_diff = df_next.sub(df_prev, fill_value=0)
print(df_diff)
cpu cpu cpu cpu mem mem mem mem load load load load drops drops drops drops latency latency latency latency gw_latency gw_latency gw_latency gw_latency upload upload upload upload download download download download sap_drops sap_drops sap_drops sap_drops sap_latency sap_latency sap_latency sap_latency
mean min max std mean min max std mean min max std mean min max std mean min max std mean min max std mean min max std mean min max std mean min max std mean min max std
date
2018-02-11 -4.282442748 0 -17 -4.361148065 -13.61068702 0 -27 -6.123815451 -3.891450382 0 -47.62 -6.426298507 -1.526717557 0 -100 -12.30842628 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
2018-02-12 5.726315789 0 21 2.938315053 22.30526316 0 23 3.581474037 6.06 0 44.75 6.798944285 0.5263157895 0 100 7.254762501 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
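But it does not subtract anything, and I also get the dates as rows, which makes no sense because I only want the differences between the stats.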
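As you can see, no subtraction was done at all. Why does this happen?
PS: Ultimately I want the percentage difference between the stats for these two dates. Is there a direct way to do that?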
Answer 0: (score: 1)
To get the difference:
df_next - df_prev.values
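Why this works while `sub` did not: pandas arithmetic between two dataframes aligns rows by index label, and the two frames share no date label. With `fill_value=0`, each row is therefore paired with zeros, which gives the negated previous day and the unchanged next day. A minimal sketch of the difference, using toy frames with plain (non-MultiIndex) columns and made-up values rather than the real metrics:

```python
import pandas as pd

# Two one-row frames with different date labels (toy values, not the real data).
prev = pd.DataFrame({"cpu": [4.0], "mem": [10.0]},
                    index=pd.to_datetime(["2018-02-11"]))
nxt = pd.DataFrame({"cpu": [6.0], "mem": [25.0]},
                   index=pd.to_datetime(["2018-02-12"]))

# Label-aligned subtraction: no index label is shared, so each row is
# paired with the fill value 0 and both dates appear in the result.
aligned = nxt.sub(prev, fill_value=0)

# Positional subtraction: .values is a plain NumPy array, so rows are
# paired by position and the index of `nxt` is kept.
positional = nxt - prev.values

print(aligned)
print(positional)
```

Here `aligned` has two rows (one per date) while `positional` has a single row labelled 2018-02-12 holding the actual differences. Note that positional subtraction assumes both frames have the same shape and column order.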
To get the % change:
(df_next - df_prev.values)/(df_prev.values) * 100
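One caveat with the percentage formula: many baseline values in the question's data are 0 (every `min`, and all the latency columns), so dividing by them produces `inf` or `NaN`. A possible way to handle this, sketched here on toy frames with made-up values, is to mask zero baselines as `NaN` before dividing. Whether `NaN` is the right result for a zero baseline is an assumption that depends on how the numbers are used.

```python
import numpy as np
import pandas as pd

# Toy frames; "drops" has a zero baseline on the previous day.
prev = pd.DataFrame({"cpu": [4.0], "drops": [0.0]},
                    index=pd.to_datetime(["2018-02-11"]))
nxt = pd.DataFrame({"cpu": [6.0], "drops": [0.0]},
                   index=pd.to_datetime(["2018-02-12"]))

# Replace zero baselines with NaN so the division yields NaN
# rather than inf or an undefined 0/0.
base = np.where(prev.values == 0, np.nan, prev.values)
pct_change = (nxt - prev.values) / base * 100

print(pct_change)
```

The `cpu` column comes out as a 50% increase, and `drops` is `NaN` because its baseline is zero.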