Question

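I am trying to make a scatter plot from a DataFrame, so I pass it the x and y components. It raises an error on the x component, specifically on the 'Year' column. I have manually checked that the Year column exists in the data, but the error still appears. Note that the Year column contains the years 1960 through 1964.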

urb_pop_reader = pd.read_csv('ind_pop_data.csv', chunksize=1000)
df_urb_pop = next(urb_pop_reader)
df_pop_ceb = df_urb_pop[df_urb_pop['CountryCode'] == 'CEB']
pops = zip(df_pop_ceb['Total Population'], 
           df_pop_ceb['Urban population (% of total)'])
pops_list = list(pops)

# Use list comprehension to create new DataFrame column 'Total Urban Population'
df_pop_ceb['Total Urban Population'] = [int(a[0]*(a[1]*0.01)) for a in pops_list]

# Plot urban population data
df_pop_ceb.plot(kind='scatter', x=df_pop_ceb['Year'], y=df_pop_ceb['Total Urban Population'])
plt.show()

Answer 1

The error is raised because you are trying to apply the plt method directly to the data frame. Try:

import matplotlib.pyplot as plt
plt.scatter(x=df_pop_ceb['Year'], y=df_pop_ceb['Total Urban Population'])
plt.title('Title')
plt.xlabel('x')
plt.ylabel('y')
plt.show()

Also, there is no need to zip the columns to compute the total urban population. You can multiply the two columns directly:

df_pop_ceb['Total Urban Population'] = (df_pop_ceb['Total Population']*df_pop_ceb['Urban population (% of total)']*0.01)
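As a rough check that the direct multiplication matches the zip-based list comprehension from the question, here is a minimal sketch. The numbers are made up for illustration (the real ones come from ind_pop_data.csv), and `.astype(int)` is added only to mirror the `int(...)` truncation in the original loop.

```python
import pandas as pd

# Hypothetical stand-in rows; not taken from the real CSV.
df = pd.DataFrame({
    "Total Population": [1000000, 2000000, 3000000],
    "Urban population (% of total)": [50.0, 25.0, 10.0],
})

# Approach from the question: zip the two columns and loop over the pairs.
pairs = zip(df["Total Population"], df["Urban population (% of total)"])
looped = [int(total * (pct * 0.01)) for total, pct in pairs]

# Vectorized approach: multiply the columns directly, then truncate to int.
vectorized = (
    df["Total Population"] * (df["Urban population (% of total)"] * 0.01)
).astype(int)

df["Total Urban Population"] = vectorized
print(df)
```

If `df` is itself a filtered slice of another DataFrame, as `df_pop_ceb` is in the question, calling `.copy()` on the slice before assigning the new column should avoid pandas' SettingWithCopyWarning.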

Hope this helps!

Answer 2

If you want to use pandas plotting, you should pass the column labels as x and y, not the data:

df_pop_ceb.plot(kind='scatter', x='Year', y='Total Urban Population')

Also, looking at the docs, I think you should do this:

df_pop_ceb.plot.scatter(x='Year', y='Total Urban Population')
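To illustrate why labels work where data does not, here is a small self-contained sketch. The DataFrame values are hypothetical, and the `Agg` backend is chosen only so the example runs without a display. Indexing a DataFrame with an integer Series makes pandas treat the Series values (1960, 1961, ...) as column labels, which is presumably the lookup the plotting call ends up performing when it receives `df_pop_ceb['Year']` instead of `'Year'`.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import pandas as pd

# Hypothetical stand-in data; the real values come from ind_pop_data.csv.
df = pd.DataFrame({
    "Year": [1960, 1961, 1962, 1963, 1964],
    "Total Urban Population": [100, 110, 125, 140, 160],
})

# Indexing with a Series makes pandas look for columns named 1960, 1961, ...
try:
    df[df["Year"]]
    lookup_failed = False
except KeyError as err:
    lookup_failed = True
    print("KeyError:", err)

# Passing the labels lets pandas look the columns up itself.
ax = df.plot.scatter(x="Year", y="Total Urban Population")
```

The returned `ax` is a matplotlib Axes, so a title or other styling can still be applied to it afterwards.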

KeyError: "None of [Int64Index([1960, 1961, 1962, 1963, 1964], dtype='int64')] are in the [columns]"
