Question

如何选择第一列以外的列？

import pandas as pd
df = pd.read_csv('bio.csv')
df

我可以选择第一列，即“索引”

df['Index']

但是，我无法选择第二列，即“高度”。

df['Height']

这是跟踪：

KeyError                                  Traceback (most recent call last)
C:\util\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   2441             try:
-> 2442                 return self._engine.get_loc(key)
   2443             except KeyError:

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'Height'

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
<ipython-input-8-58aff8413556> in <module>()
----> 1 df['Height']

C:\util\Anaconda3\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)
   1962             return self._getitem_multilevel(key)
   1963         else:
-> 1964             return self._getitem_column(key)
   1965 
   1966     def _getitem_column(self, key):

C:\util\Anaconda3\lib\site-packages\pandas\core\frame.py in _getitem_column(self, key)
   1969         # get column
   1970         if self.columns.is_unique:
-> 1971             return self._get_item_cache(key)
   1972 
   1973         # duplicate columns & possible reduce dimensionality

C:\util\Anaconda3\lib\site-packages\pandas\core\generic.py in _get_item_cache(self, item)
   1643         res = cache.get(item)
   1644         if res is None:
-> 1645             values = self._data.get(item)
   1646             res = self._box_item_values(item, values)
   1647             cache[item] = res

C:\util\Anaconda3\lib\site-packages\pandas\core\internals.py in get(self, item, fastpath)
   3588 
   3589             if not isnull(item):
-> 3590                 loc = self.items.get_loc(item)
   3591             else:
   3592                 indexer = np.arange(len(self.items))[isnull(self.items)]

C:\util\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   2442                 return self._engine.get_loc(key)
   2443             except KeyError:
-> 2444                 return self._engine.get_loc(self._maybe_cast_indexer(key))
   2445 
   2446         indexer = self.get_indexer([key], method=method, tolerance=tolerance)

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'Height'

Answer 1

以下是完整的答案

import pandas as pd
df = pd.read_csv('bio.csv', sep='[ \t]*,[ \t]*', engine='python')
df['Height']

Theis是输出：

Out[22]: 0      65.78
         1      71.52
         2      69.40
         3      68.22
         4      67.79
         5      68.70
         6      69.80
         7      70.01
         8      67.90
         9      66.78
         10     66.49
         11     67.62
         12     68.30
         13     67.12
         14     68.28
         15     71.09

如何从pandas数据框中选择第二列或矩阵？

1 个答案: