I have the following dataset:
import pandas as pd
w = pd.Series(['EY', 'EY', 'EY', 'KPMG', 'KPMG', 'KPMG', 'BAIN', 'BAIN', 'BAIN'])
x = pd.Series([2020,2019,2018,2020,2019,2018,2020,2019,2018])
y = pd.Series([100000, 500000, 1000000, 50000, 100000, 40000, 1000, 500, 4000])
z = pd.Series([10000, 10000, 20000, 25000, 50000, 10000, 100000, 50500, 120000])
df = pd.DataFrame({'consultant': w, 'fiscal_year':x, 'actual_cost':y, 'budgeted_cost':z})
indexer_consultant_fy = ['consultant', 'fiscal_year']
df = df.set_index(indexer_consultant_fy).sort_index(ascending=True)
df['actual_budget_pct_diff'] = df.pct_change(axis='columns',fill_method='ffill')['budgeted_cost']
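How can I swap actual_cost and budgeted_cost in the last line of code without reordering the columns in the dataframe?
The result should be that actual_budget_pct_diff is positive when actual_cost is higher than budgeted_cost. Thanks, everyone!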
Answer 0: (score: 3)
Just specify periods=-1 and select the actual_cost column, as follows:
df['actual_budget_pct_diff'] = df.pct_change(periods=-1, axis='columns',fill_method='ffill')['actual_cost']
Out[160]:
                        actual_cost  budgeted_cost  actual_budget_pct_diff
consultant fiscal_year
BAIN       2018                4000         120000               -0.966667
           2019                 500          50500               -0.990099
           2020                1000         100000               -0.990000
EY         2018             1000000          20000               49.000000
           2019              500000          10000               49.000000
           2020              100000          10000                9.000000
KPMG       2018               40000          10000                3.000000
           2019              100000          50000                1.000000
           2020               50000          25000                1.000000
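To see why periods=-1 flips the sign, it can help to compare it with the default periods=1 on a couple of rows. The following is only a minimal sketch: the frame `demo` and the names `forward` and `backward` are illustrative, and fill_method=None is passed on the assumption that the data has no missing values to fill.

```python
import pandas as pd

# Illustrative two-row frame with the same column order as the question.
demo = pd.DataFrame({'actual_cost': [4000, 1000000],
                     'budgeted_cost': [120000, 20000]})

# Default periods=1: each column is compared with the column to its left,
# so budgeted_cost is measured relative to actual_cost.
forward = demo.pct_change(axis='columns', fill_method=None)['budgeted_cost']

# periods=-1: each column is compared with the column to its right,
# so actual_cost is measured relative to budgeted_cost.
backward = demo.pct_change(periods=-1, axis='columns', fill_method=None)['actual_cost']

print(forward.tolist())
print(backward.tolist())
```

With periods=-1 the first row (under budget) comes out negative and the second row (over budget) comes out positive, which is the sign convention asked for.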
Answer 1: (score: 2)
Since you only want to compute the pct_change between two columns, you can do it manually; the calculation will still be vectorized:
df['actual_budget_pct_diff'] = (df.actual_cost-df.budgeted_cost)/df.budgeted_cost
You get:
                        actual_cost  budgeted_cost  actual_budget_pct_diff
consultant fiscal_year
BAIN       2018                4000         120000               -0.966667
           2019                 500          50500               -0.990099
           2020                1000         100000               -0.990000
EY         2018             1000000          20000               49.000000
           2019              500000          10000               49.000000
           2020              100000          10000                9.000000
KPMG       2018               40000          10000                3.000000
           2019              100000          50000                1.000000
           2020               50000          25000                1.000000
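The manual formula is the ordinary relative difference, (actual - budgeted) / budgeted, so its sign follows whether the actual cost is above or below budget. A small pure-Python check on three rows taken from the question's data, where `pct_diff` is a hypothetical helper name used only for illustration:

```python
def pct_diff(actual, budgeted):
    """Relative difference of actual cost versus budgeted cost."""
    return (actual - budgeted) / budgeted

# (consultant, fiscal_year, actual_cost, budgeted_cost) from the question's data
rows = [
    ('BAIN', 2018, 4000, 120000),    # under budget, expected to be negative
    ('EY', 2020, 100000, 10000),     # over budget, expected to be positive
    ('KPMG', 2019, 100000, 50000),   # over budget, expected to be positive
]

results = {(name, year): pct_diff(actual, budgeted)
           for name, year, actual, budgeted in rows}

for key, value in results.items():
    print(key, round(value, 6))
```

The rounded values agree with the corresponding rows of the output above. Note that a budgeted cost of zero would raise ZeroDivisionError in this plain-Python version.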
Answer 2: (score: 2)
You can simply apply df.pct_change to a separate dataframe with the columns reordered, without changing the columns of df itself.
df['actual_budget_pct_diff'] = df[['budgeted_cost', 'actual_cost']].pct_change(axis='columns', fill_method='ffill')['actual_cost']
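Note that df[['budgeted_cost', 'actual_cost']] is a new dataframe, so it does not affect the column order of the original dataframe df. The column order of df is therefore preserved, as required: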
                        actual_cost  budgeted_cost  actual_budget_pct_diff
consultant fiscal_year
BAIN       2018                4000         120000               -0.966667
           2019                 500          50500               -0.990099
           2020                1000         100000               -0.990000
EY         2018             1000000          20000               49.000000
           2019              500000          10000               49.000000
           2020              100000          10000                9.000000
KPMG       2018               40000          10000                3.000000
           2019              100000          50000                1.000000
           2020               50000          25000                1.000000
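As a quick sanity check of the claim that the original column order is untouched, the sketch below records the columns before the selection and compares the result with the manual formula. The frame `df_demo` and the other variable names are illustrative, and fill_method=None assumes there are no missing values.

```python
import pandas as pd

# Illustrative frame; the column order matches the question.
df_demo = pd.DataFrame({'actual_cost': [4000, 100000],
                        'budgeted_cost': [120000, 10000]})
columns_before = list(df_demo.columns)

# Selecting with a list of labels returns a new frame in the requested order.
reordered = df_demo[['budgeted_cost', 'actual_cost']]
pct = reordered.pct_change(axis='columns', fill_method=None)['actual_cost']

# Manual relative difference, for comparison.
manual = (df_demo['actual_cost'] - df_demo['budgeted_cost']) / df_demo['budgeted_cost']

df_demo['actual_budget_pct_diff'] = pct
print(list(df_demo.columns))
print((pct - manual).abs().max())
```

The first two columns of `df_demo` stay in their original order, and the two calculations differ by at most floating-point rounding.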