I am trying to create dummy variables for one of the variables in this dataset, but I get an error and I don't know how to fix it. Any clues?
Code:
df = pd.read_excel(open('DID dataset.xlsx', 'rb'), sheet_name = 'All2')
Location_dummy = pd.get_dummies(df['Location'], drop_first=True)
Data: https://gyazo.com/79af7378c4e06c0f36f7f43d03a65119
Error:
Location_dummy = pd.get_dummies(df['Location'], drop_first=True)
Traceback (most recent call last):
File "<ipython-input-5-f9cbe04c43a1>", line 1, in <module>
Location_dummy = pd.get_dummies(df['Location'], drop_first=True)
File "C:\Users\Michael\Anaconda3\lib\site-packages\pandas\core\frame.py", line 2685, in __getitem__
return self._getitem_column(key)
File "C:\Users\Michael\Anaconda3\lib\site-packages\pandas\core\frame.py", line 2692, in _getitem_column
return self._get_item_cache(key)
File "C:\Users\Michael\Anaconda3\lib\site-packages\pandas\core\generic.py", line 2486, in _get_item_cache
values = self._data.get(item)
File "C:\Users\Michael\Anaconda3\lib\site-packages\pandas\core\internals.py", line 4115, in get
loc = self.items.get_loc(item)
File "C:\Users\Michael\Anaconda3\lib\site-packages\pandas\core\indexes\base.py", line 3065, in get_loc
return self._engine.get_loc(self._maybe_cast_indexer(key))
File "pandas\_libs\index.pyx", line 140, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\index.pyx", line 162, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\hashtable_class_helper.pxi", line 1492, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas\_libs\hashtable_class_helper.pxi", line 1500, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'Location'
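The same error occurs when I just type df['Location'].
Is something wrong with this particular column in the Excel dataset, given that I can create dummies for the other variables? Or is it something else?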
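Answer 0 (score: 0)
Your code itself is fine. A KeyError: 'Location' means that no column has exactly that name, so the header most likely contains leading or trailing whitespace (or differs in capitalization). Print the column names to check: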
print("Column headings:")
print(df.columns)
You can then access the column as df['Location '] or df[' location'], whichever matches the printed name exactly, and change the get_dummies call accordingly.
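An alternative to matching the odd header by hand is to normalize the column names once, right after loading. The following is only a minimal sketch: the small DataFrame is a hypothetical stand-in for the 'All2' sheet (the real data is not available here), and it assumes the cause really is a trailing space in the header.

```python
import pandas as pd

# Hypothetical stand-in for the Excel sheet; note the trailing space in 'Location '.
df = pd.DataFrame({
    "Location ": ["North", "South", "North", "East"],
    "Value": [1, 2, 3, 4],
})

# repr() makes stray whitespace in each header visible.
print([repr(c) for c in df.columns])

# Strip leading/trailing whitespace from every column name
# (assumes all column names are strings).
df.columns = df.columns.str.strip()

# The original call now finds the column.
Location_dummy = pd.get_dummies(df["Location"], drop_first=True)
print(Location_dummy)
```

If the header also differs in capitalization (for example ' location'), stripping alone will not be enough; in that case use the exact name shown by print(df.columns) or rename the column explicitly.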