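Nothing in the data has changed, yet my code has suddenly stopped working, even when I revert to a previous version. I'm not sure why and can't figure it out. I've compared it against earlier versions, and I'm hoping you can spot what I'm missing or recommend a fix for this new KeyError.
As I said, I checked the data and nothing has changed, so I assume the problem is in my code. But even reverting to an earlier version (which used to work) doesn't help.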
Before this code, I define two dictionaries: state_code (maps each state to a region code) and states (maps each state to its abbreviation).
import pandas as pd

excel_file = r'/Users/amandawhiting/Desktop/PA_spending_excel.xlsx'
df = pd.read_excel(excel_file)
df = df.rename(columns={'DAMAGE_CATEGORY_CODE': 'damageCode',
                        'FEDERAL_SHARE_OBLIGATED': 'FedShareObligated',
                        'PROJECT_AMOUNT': 'ProjectAmount'})
df = df[df['FedShareObligated']>= 0]
df = df[df['ProjectAmount'] >= 0] # Removes missing/null projects
# Exclude these damage categories
excluded_codes = ['A - Debris Removal', 'B - Protective Measures',
                  'Z - State Management', 'H - Fire Management']
df = df[~df['damageCode'].isin(excluded_codes)]
df = df.drop_duplicates()
df = df.reset_index(drop=True)
df2 = pd.read_csv("/Users/amandawhiting/Desktop/DisasterDeclarationsSummaries.csv", usecols = ['disasterNumber', 'fyDeclared', 'state'])
df2 = df2[df2['fyDeclared'] > 1991]
df2 = df2[df2['fyDeclared'] < 2017]
df2 = df2.reset_index(drop=True) # Resets index
df2['disasterNumber'] = df2['disasterNumber'].astype(int)
fulldf = pd.merge(df, df2, on='disasterNumber', how='inner')
fulldf = fulldf.drop_duplicates()
fulldf = fulldf.reset_index(drop=True)
fulldf["Region"] = fulldf['state'].map(state_code)
df_state = fulldf.copy()
df_fyr = fulldf.copy()
df_region = fulldf.copy()
df_damageCat = fulldf.copy()
fulldf["TotalProjectCost, 1.3%"] = round(fulldf["ProjectAmount"] * .013)
fulldf["TotalProjectCost, 1.6%"] = round(fulldf["ProjectAmount"] * .016)
fulldf["TotalProjectCost, 15%"] = round(fulldf["ProjectAmount"] * .15)
fulldf["TotalProjectCost, 46%"] = round(fulldf["ProjectAmount"] * .46)
The following lines in particular no longer produce the result they used to:
df_state=df_state[["ProjectAmount"]].groupby('state').sum()
df_state['TotalProjectCost'] = ['${:,.2f}MM'.format(x) for x in df_state['ProjectAmount'] * (.15) / 1000000]
display(df_state)
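Expected result: a table that maps each state to its summed ProjectAmount, plus a column showing that amount formatted in millions.
Actual result: KeyError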
KeyError Traceback (most recent call last)
<ipython-input-31-6b223454133a> in <module>
----> 1 df_state=df_state[["ProjectAmount"]].groupby('state').sum()
2
3 df_state['TotalProjectCost'] = ['${:,.2f}MM'.format(x) for x in df_state['ProjectAmount'] * (.15) / 1000000]
4
5
/anaconda3/lib/python3.7/site-packages/pandas/core/generic.py in groupby(self, by, axis, level, as_index, sort, group_keys, squeeze, observed, **kwargs)
7630 return groupby(self, by=by, axis=axis, level=level, as_index=as_index,
7631 sort=sort, group_keys=group_keys, squeeze=squeeze,
-> 7632 observed=observed, **kwargs)
7633
7634 def asfreq(self, freq, method=None, how=None, normalize=False,
/anaconda3/lib/python3.7/site-packages/pandas/core/groupby/groupby.py in groupby(obj, by, **kwds)
2108 raise TypeError('invalid type: {}'.format(obj))
2109
-> 2110 return klass(obj, by, **kwds)
/anaconda3/lib/python3.7/site-packages/pandas/core/groupby/groupby.py in __init__(self, obj, keys, axis, level, grouper, exclusions, selection, as_index, sort, group_keys, squeeze, observed, **kwargs)
358 sort=sort,
359 observed=observed,
--> 360 mutated=self.mutated)
361
362 self.obj = obj
/anaconda3/lib/python3.7/site-packages/pandas/core/groupby/grouper.py in _get_grouper(obj, key, axis, level, sort, observed, mutated, validate)
576 in_axis, name, level, gpr = False, None, gpr, None
577 else:
--> 578 raise KeyError(gpr)
579 elif isinstance(gpr, Grouper) and gpr.key is not None:
580 # Add key to exclusions
KeyError: 'state'
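One possible explanation, offered as a sketch rather than a confirmed diagnosis: `df_state[["ProjectAmount"]]` keeps only the ProjectAmount column, so the resulting frame has no `state` column left for `groupby('state')` to find. Grouping first and selecting the column afterwards avoids that. The small frame below is a made-up stand-in for `fulldf` (its values are illustrative assumptions, not the real data), and `by_state` is a hypothetical name used so that `df_state` is not overwritten.

```python
import pandas as pd

# Hypothetical stand-in for fulldf; the rows are invented for illustration.
df_state = pd.DataFrame({
    "state": ["TX", "TX", "CA"],
    "ProjectAmount": [1_000_000.0, 3_000_000.0, 2_000_000.0],
})

# Group on 'state' while the column still exists, then pick the column to sum.
by_state = df_state.groupby("state")[["ProjectAmount"]].sum()

# Same formatting as in the question: 15% of the total, expressed in millions.
by_state["TotalProjectCost"] = [
    "${:,.2f}MM".format(x)
    for x in by_state["ProjectAmount"] * 0.15 / 1_000_000
]
print(by_state)
```

Assigning to a new name also means the cell can be rerun in a notebook without `df_state` having already been replaced by the aggregated table, whose `state` values live in the index rather than in a column.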