Question

I have a strange problem, and I am not sure whether this is expected behavior. I ran into it on Python 3.6.

The dataset is available at this link

import pandas as pd
df = pd.read_csv("./data/gapminder.tsv", sep="\t")

The following code runs without any error:

subset = df[['country', 'pop']]
subset.head()

But if I try to select the same columns by their integer index, I get an error:

subset = df[[0,4]]
> KeyError: '[0 4] not in index'

The full IPython error details are at this link

Answer 1

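You need iloc, because df[[0, 4]] looks for columns labeled 0 and 4 rather than the columns at those positions: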

url = 'https://raw.githubusercontent.com/jennybc/gapminder/master/inst/gapminder.tsv'
df = pd.read_csv(url, sep="\t")
print (df.head())
       country continent  year  lifeExp       pop   gdpPercap
0  Afghanistan      Asia  1952   28.801   8425333  779.445314
1  Afghanistan      Asia  1957   30.332   9240934  820.853030
2  Afghanistan      Asia  1962   31.997  10267083  853.100710
3  Afghanistan      Asia  1967   34.020  11537966  836.197138
4  Afghanistan      Asia  1972   36.088  13079460  739.981106

subset = df[['country', 'pop']]
print (subset.head())
       country       pop
0  Afghanistan   8425333
1  Afghanistan   9240934
2  Afghanistan  10267083
3  Afghanistan  11537966
4  Afghanistan  13079460

subset = df.iloc[:, [0,4]]
print (subset.head())
       country       pop
0  Afghanistan   8425333
1  Afghanistan   9240934
2  Afghanistan  10267083
3  Afghanistan  11537966
4  Afghanistan  13079460
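
To make the difference between label-based and position-based selection concrete, here is a minimal sketch. It assumes a small hand-built DataFrame (a hypothetical stand-in using the first two rows shown above) rather than the real file, so it runs without a download. The names by_position, by_label and raised are illustrative only.

```python
import pandas as pd

# Hypothetical miniature stand-in for the gapminder data;
# the values are copied from the first two rows printed above.
df = pd.DataFrame({
    "country": ["Afghanistan", "Afghanistan"],
    "continent": ["Asia", "Asia"],
    "year": [1952, 1957],
    "lifeExp": [28.801, 30.332],
    "pop": [8425333, 9240934],
    "gdpPercap": [779.445314, 820.853030],
})

# Position-based selection: all rows, columns at positions 0 and 4.
by_position = df.iloc[:, [0, 4]]

# Alternative: turn the positions into labels first, then select by label.
by_label = df[df.columns[[0, 4]]]

# Plain df[[0, 4]] treats 0 and 4 as column labels, which do not exist here.
try:
    df[[0, 4]]
    raised = False
except KeyError:
    raised = True

print(by_position.columns.tolist())
print(by_position.equals(by_label))
print(raised)
```

Both forms select the country and pop columns. iloc is usually the more direct choice when only the positions are known, while the df.columns route can be handy if the resulting labels are needed again later.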

Pandas subset does not seem to work using the index in Python 3.6
