Selecting columns with integer headers in a pandas DataFrame

Time: 2019-01-23 18:36:13

Tags: python python-3.x pandas dataframe

I have a pandas DataFrame that looks like this:

   100  200  300  400
0    1    1    0    1
1    1    1    1    0

What I want to do is select specific columns from this DataFrame. However, when I try the following code (df_matrix is the DataFrame shown above):

intermediary_df = df_matrix["100"]

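it does not work, and as far as I can tell that is because the column label is an integer. I tried forcing it with str(100), but that gives the same error as before: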

File "pandas\_libs\hashtable_class_helper.pxi", line 958, in pandas._libs.hashtable.Int64HashTable.get_item
TypeError: an integer is required

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "A:\python project\venv\lib\site-packages\pandas\core\indexes\base.py", line 3078, in get_loc
    return self._engine.get_loc(key)
  File "pandas\_libs\index.pyx", line 140, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\index.pyx", line 164, in pandas._libs.index.IndexEngine.get_loc
KeyError: '100'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "pandas\_libs\index.pyx", line 162, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\hashtable_class_helper.pxi", line 958, in pandas._libs.hashtable.Int64HashTable.get_item
TypeError: an integer is required

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "A:/python project/testing/testing4.py", line 42, in <module>
    intermediary_df = df_matrix["100"]
  File "A:\python project\venv\lib\site-packages\pandas\core\frame.py", line 2688, in __getitem__
    return self._getitem_column(key)
  File "A:\python project\venv\lib\site-packages\pandas\core\frame.py", line 2695, in _getitem_column
    return self._get_item_cache(key)
  File "A:\python project\venv\lib\site-packages\pandas\core\generic.py", line 2489, in _get_item_cache
    values = self._data.get(item)
  File "A:\python project\venv\lib\site-packages\pandas\core\internals.py", line 4115, in get
    loc = self.items.get_loc(item)
  File "A:\python project\venv\lib\site-packages\pandas\core\indexes\base.py", line 3080, in get_loc
    return self._engine.get_loc(self._maybe_cast_indexer(key))
  File "pandas\_libs\index.pyx", line 140, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\index.pyx", line 164, in pandas._libs.index.IndexEngine.get_loc
KeyError: '100'

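Does anyone know how to solve this? Thanks!

Edit 1:

After trying intermediary_df = df_matrix[100], it works as expected. By the way, if anyone else runs into this problem and wants to select several columns at once, you can use: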

intermediary_df = df_matrix[[100, 300]]

The output will be:

   100  300
0    1    0
1    1    1
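
To make the behaviour above easy to reproduce, here is a minimal sketch. It assumes the frame can be rebuilt from the values shown in the question; the constructor call and the helper name string_lookup_failed are illustrative and do not come from the original post.

```python
import pandas as pd

# Rebuild the frame from the question. The column labels are ints, not strings.
df_matrix = pd.DataFrame(
    [[1, 1, 0, 1], [1, 1, 1, 0]],
    columns=[100, 200, 300, 400],
)

# Checking the dtype of the column index shows what kind of key is expected.
print(df_matrix.columns.dtype)

# Integer keys match the integer labels.
single = df_matrix[100]         # Series for the column labelled 100
subset = df_matrix[[100, 300]]  # DataFrame with the two requested columns

# A string key does not match an integer label and raises KeyError.
try:
    df_matrix["100"]
    string_lookup_failed = False
except KeyError:
    string_lookup_failed = True
```

Printing df_matrix.columns before indexing is usually the quickest way to see whether the labels are integers or strings.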

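2 answers:

Answer 0 (score: 1)

I think your column labels are integers. If plain indexing with the integer still does not work, try DataFrame.loc (by label) or DataFrame.iloc (by position).

Example: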

intermediary_df = df_matrix.loc[:,100]

intermediary_df = df_matrix.iloc[:,0]
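
The difference between the two lines above is that .loc selects by label while .iloc selects by position. A small sketch, assuming the same frame as in the question (rebuilt here so the snippet runs on its own; the variable names are illustrative):

```python
import pandas as pd

df_matrix = pd.DataFrame(
    [[1, 1, 0, 1], [1, 1, 1, 0]],
    columns=[100, 200, 300, 400],
)

# .loc uses the column label, so 100 means "the column named 100".
by_label = df_matrix.loc[:, 100]

# .iloc uses the zero-based position, so 0 means "the first column".
by_position = df_matrix.iloc[:, 0]

# Both refer to the same column in this frame.
same = by_label.equals(by_position)

# Lists of labels or positions select several columns at once.
several_by_label = df_matrix.loc[:, [100, 300]]
several_by_position = df_matrix.iloc[:, [0, 2]]
```

With integer labels the two are easy to confuse: df_matrix.iloc[:, 100] would be out of bounds here, because the frame has only four columns.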

Answer 1 (score: 1)

Since your column labels are ints, just use the following in this case.

intermediary_df = df_matrix[100]

If you would rather access the columns by str, use:

df.columns = [str(x) for x in df.columns]

Then

df['100']

Output

0    1
1    1
Name: 100, dtype: int64
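
If you would rather not overwrite the labels on the original frame, one possible alternative is DataFrame.rename with str as the mapping function, which returns a new frame. This is a sketch under the assumption that the frame matches the one in the question; df_str is an illustrative name.

```python
import pandas as pd

df_matrix = pd.DataFrame(
    [[1, 1, 0, 1], [1, 1, 1, 0]],
    columns=[100, 200, 300, 400],
)

# rename applies str() to every column label and returns a new DataFrame;
# df_matrix itself keeps its integer labels.
df_str = df_matrix.rename(columns=str)

# String keys now work on the renamed copy.
col_as_str = df_str["100"]

# Integer keys still work on the original.
col_as_int = df_matrix[100]
```

Keeping both frames avoids breaking other code that still indexes with integers.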