I have a DataFrame like this:
BirthYear Sex Area Count
2015 W Dhaka 6
2015 M Dhaka 3
2015 W Khulna 1
2015 M Khulna 8
2014 M Dhaka 13
2014 W Dhaka 20
2014 M Khulna 9
2014 W Khulna 6
2013 W Dhaka 11
2013 M Dhaka 2
2013 W Khulna 8
2013 M Khulna 5
2012 M Dhaka 12
2012 W Dhaka 4
2012 W Khulna 7
2012 M Khulna 1
Now I want to create a bar chart in pandas with just two bars, male and female, showing only people born in 2015. Code:
df = pd.read_csv('out.csv')
df=df.reset_index()
df=df.loc[df["BirthYear"]==2015]
agg_df = df.groupby(['Sex']).sum()
agg_df.reset_index(inplace=True)
piv_df = agg_df.pivot(columns='Sex', values='Count')
piv_df.plot.bar(stacked=True)
plt.show()
When I run it, IDLE shows this error:
Traceback (most recent call last):
File "C:\Users\sabid\AppData\Local\Programs\Python\Python35\lib\site-packages\pandas\indexes\base.py", line 1945, in get_loc
return self._engine.get_loc(key)
File "pandas\index.pyx", line 137, in pandas.index.IndexEngine.get_loc (pandas\index.c:4066)
File "pandas\index.pyx", line 159, in pandas.index.IndexEngine.get_loc (pandas\index.c:3930)
File "pandas\hashtable.pyx", line 675, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12408)
File "pandas\hashtable.pyx", line 683, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12359)
KeyError: 'BirthYear'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:/Users/sabid/Dropbox/Freelancing/data visualization python/pie.py", line 8, in <module>
df=df.loc[df["StichtagDatJahr"]==2015]
File "C:\Users\sabid\AppData\Local\Programs\Python\Python35\lib\site-packages\pandas\core\frame.py", line 1997, in __getitem__
return self._getitem_column(key)
File "C:\Users\sabid\AppData\Local\Programs\Python\Python35\lib\site-packages\pandas\core\frame.py", line 2004, in _getitem_column
return self._get_item_cache(key)
File "C:\Users\sabid\AppData\Local\Programs\Python\Python35\lib\site-packages\pandas\core\generic.py", line 1350, in _get_item_cache
values = self._data.get(item)
File "C:\Users\sabid\AppData\Local\Programs\Python\Python35\lib\site-packages\pandas\core\internals.py", line 3290, in get
loc = self.items.get_loc(item)
File "C:\Users\sabid\AppData\Local\Programs\Python\Python35\lib\site-packages\pandas\indexes\base.py", line 1947, in get_loc
return self._engine.get_loc(self._maybe_cast_indexer(key))
File "pandas\index.pyx", line 137, in pandas.index.IndexEngine.get_loc (pandas\index.c:4066)
File "pandas\index.pyx", line 159, in pandas.index.IndexEngine.get_loc (pandas\index.c:3930)
File "pandas\hashtable.pyx", line 675, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12408)
File "pandas\hashtable.pyx", line 683, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12359)
KeyError: 'BirthYear'
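From this link I learned that this happens because there is some extra header content in front of the 'BirthYear' column name. But I don't know how to remove that header and make the code work. Is there an efficient solution?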
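Answer 0 (score: 1)
You can rename the columns. Note that rename(columns=...) expects a dict or a function, not a list, so assign the names directly, right after read_csv and before reset_index():
df.columns = ["BirthYear", "Sex", "Area", "Count"]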
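Before renaming, it can help to look at what the column names actually are. The sketch below assumes the problem is a UTF-8 byte-order mark and/or stray whitespace in the header row, which is only a guess about the real out.csv; the `raw` string is a hypothetical stand-in for that file.

```python
import io
import pandas as pd

# Hypothetical stand-in for out.csv: the header is assumed to carry a
# leading BOM character and trailing whitespace on the first column name.
raw = "\ufeffBirthYear ,Sex,Area,Count\n2015,W,Dhaka,6\n2015,M,Dhaka,3\n"

df = pd.read_csv(io.StringIO(raw))

# Inspect the real names first; repr-style output makes hidden characters visible.
print(df.columns.tolist())

# Strip surrounding whitespace and any leftover BOM character from each name.
df.columns = df.columns.str.strip().str.replace("\ufeff", "", regex=False)

# The original filter now finds the column.
born_2015 = df.loc[df["BirthYear"] == 2015]
```

If the printed list shows something else entirely (for example a different column name), cleaning the names will not help and renaming is the better route.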
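Answer 1 (score: 0)
I assume you want output like this:
I'm not sure about this, but I think using the pivot method is what tripped you up. You don't need pivot, because agg_df is already essentially a pivot table. Here is the code I used to create that chart: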
import pandas as pd
import matplotlib.pyplot as plt
# I made this to approximate your CSV file.
table = {
'BirthYear': [2015, 2015, 2015, 2015, 2014, 2014,],
'Sex': ['W', 'M', 'W', 'M', 'M', 'W',],
'Area': ['Dhaka', 'Dhaka', 'Khulna', 'Khulna', 'Dhaka', 'Dhaka',],
'Count': [6, 3, 1, 8, 13, 20]
}
df = pd.DataFrame(table)
df = df.reset_index(drop=True)
# Select people born in 2015.
df = df.loc[df["BirthYear"] == 2015]
# This is basically a pivot table.
agg_df = df.groupby(['Sex']).sum()
# Make the plot.
agg_df['Count'].plot.bar(stacked=True)
# Needed when running as a script (e.g. from IDLE) so the window appears.
plt.show()
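If a stacked chart is what you are after, a pivot can still be useful once it is given an index. This is only a sketch, assuming the stacks should be the areas within each sex; it uses the 2015 rows from the question, and the output file name is made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# The 2015 rows from the question's sample data.
df = pd.DataFrame({
    "BirthYear": [2015, 2015, 2015, 2015],
    "Sex": ["W", "M", "W", "M"],
    "Area": ["Dhaka", "Dhaka", "Khulna", "Khulna"],
    "Count": [6, 3, 1, 8],
})

# One row per sex, one column per area; each cell is the summed Count.
piv_df = df.pivot_table(index="Sex", columns="Area", values="Count", aggfunc="sum")

# Each bar is a sex, and the segments of the bar are the areas.
ax = piv_df.plot.bar(stacked=True)
ax.set_ylabel("Count")

# Hypothetical file name. When working interactively, drop the
# matplotlib.use("Agg") line above and call plt.show() instead.
ax.figure.savefig("births_2015_by_sex.png")
```

With only index set to Sex, each bar's total height matches the per-sex sum from the groupby version, so the two charts agree on the totals.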