Unable to get dummy variables for a specific variable in a pandas DataFrame

Time: 2019-02-10 19:02:15

Tags: python pandas

I am trying to create dummy variables for one of the variables in this dataset, but I get an error and I don't know how to fix it. Any clues?

Code:

import pandas as pd
# Pass the path directly so pandas opens and closes the file itself
df = pd.read_excel('DID dataset.xlsx', sheet_name='All2')
Location_dummy = pd.get_dummies(df['Location'], drop_first=True)

Data: https://gyazo.com/79af7378c4e06c0f36f7f43d03a65119

Error:

Location_dummy = pd.get_dummies(df['Location'], drop_first=True)
Traceback (most recent call last):

File "<ipython-input-5-f9cbe04c43a1>", line 1, in <module>
Location_dummy = pd.get_dummies(df['Location'], drop_first=True)

File "C:\Users\Michael\Anaconda3\lib\site-packages\pandas\core\frame.py", line 2685, in __getitem__
return self._getitem_column(key)

File "C:\Users\Michael\Anaconda3\lib\site-packages\pandas\core\frame.py", line 2692, in _getitem_column
return self._get_item_cache(key)

File "C:\Users\Michael\Anaconda3\lib\site-packages\pandas\core\generic.py", line 2486, in _get_item_cache
values = self._data.get(item)

File "C:\Users\Michael\Anaconda3\lib\site-packages\pandas\core\internals.py", line 4115, in get
loc = self.items.get_loc(item)

File "C:\Users\Michael\Anaconda3\lib\site-packages\pandas\core\indexes\base.py", line 3065, in get_loc
return self._engine.get_loc(self._maybe_cast_indexer(key))

File "pandas\_libs\index.pyx", line 140, in pandas._libs.index.IndexEngine.get_loc

File "pandas\_libs\index.pyx", line 162, in pandas._libs.index.IndexEngine.get_loc

File "pandas\_libs\hashtable_class_helper.pxi", line 1492, in pandas._libs.hashtable.PyObjectHashTable.get_item

File "pandas\_libs\hashtable_class_helper.pxi", line 1500, in pandas._libs.hashtable.PyObjectHashTable.get_item

KeyError: 'Location'

The same error occurs when I type:

df['Location']

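Is there a problem with this particular column in the Excel dataset, given that I can get dummy variables for the other variables? Or is it something else?

1 answer:

Answer 0 (score: 0)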

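Your code is fine. The KeyError means that no column is named exactly 'Location', so the problem is most likely the column name itself: the header probably has leading or trailing whitespace, or different capitalization. Check the actual names with: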

print("Column headings:")
print(df.columns)

Then use the exact name that is printed, for example df['Location '] or df[' location'], to access the column, and change the get_dummies call accordingly.
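
Instead of matching the stray whitespace by hand, the column names can be normalized once. The sketch below is only an illustration: the small DataFrame is a made-up stand-in for the Excel sheet (the real column names are not visible in the question), and it assumes every column header is a string.

```python
import pandas as pd

# Hypothetical stand-in for the Excel sheet: the header has a trailing space.
df = pd.DataFrame({
    "Location ": ["North", "South", "East", "South"],
    "Sales": [10, 20, 30, 40],
})

# repr() makes stray whitespace visible in each column name.
print([repr(col) for col in df.columns])

# Strip leading/trailing whitespace from every column name
# (assumes all headers are strings).
df.columns = df.columns.str.strip()

# The original call now finds the column.
Location_dummy = pd.get_dummies(df["Location"], drop_first=True)
print(Location_dummy)
```

If the mismatch turns out to be capitalization rather than whitespace, renaming the column with df.rename(columns={...}) to the expected name would be the analogous fix.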