I'm doing some basic data analysis and I'm struggling to create a bar chart from 3 datasets.
Here is my data:
datasetArgentina = {'Year': ["2000", "2001", "2002", "2003", "2004", "2005", "2006", "2007","2008", "2009", "2010", "2011", "2012", "2013", "2014", "2015","2016"], 'Mortality': ['11000', '10000' ,'10000' ,'10000' ,'10000' ,'9300' ,'8900' ,'8700', '9000' , '8600' ,'8300' ,'8100','7800' ,'8000', '7500', '7500', '7300']}
datasetColumbia = {'Year': ["2000", "2001", "2002", "2003", "2004", "2005", "2006", "2007","2008", "2009", "2010", "2011", "2012", "2013", "2014", "2015","2016"], 'Mortality': ['1500 ','1600', '1500' ,'1600' ,'1500', '1200' ,'1300', '1400' ,'1400', '1500' ,'1500' ,'1500' ,'1600' ,'1500', '1500', '1400', '1400']}
datasetBrazil = {'Year': ["2000", "2001", "2002", "2003", "2004", "2005", "2006", "2007","2008", "2009", "2010", "2011", "2012", "2013", "2014", "2015","2016"], 'Mortality': ['11000', '10000' ,'10000' ,'10000' ,'10000' ,'9300' ,'8900' ,'8700', '9000' , '8600' ,'8300' ,'8100','7800' ,'8000', '7500', '7500', '7300']}
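Any suggestions on how to turn this into one big bar chart, with each country in a different color?
Here is my poor attempt at merging the datasets and printing the result.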
import pandas as pd
import matplotlib.pyplot as plt
df4 = pd.DataFrame.from_dict(datasetArgentina)
df5 = pd.DataFrame.from_dict(datasetColumbia)
df6 = pd.DataFrame.from_dict(datasetBrazil)
df7 = pd.merge(df4, df5, on='Year')
df8 = pd.merge(df6, df7, on='Year', how='left')
print(df7)
print(df8)
plt.bar(df8['Year'], df8['Mortality'])
plt.title('South America')
plt.xticks(df8['Year'], rotation=90)
plt.xlabel('Year')
plt.ylabel('Mortality')
plt.tight_layout()
plt.show()
Any help would be great.
Output:
df7 Mortality_x Year Mortality_y
0 11000 2000 1500
1 10000 2001 1600
2 10000 2002 1500
3 10000 2003 1600
4 10000 2004 1500
5 9300 2005 1200
6 8900 2006 1300
7 8700 2007 1400
8 9000 2008 1400
9 8600 2009 1500
10 8300 2010 1500
11 8100 2011 1500
12 7800 2012 1600
13 8000 2013 1500
14 7500 2014 1500
15 7500 2015 1400
16 7300 2016 1400
df8 Mortality Year Mortality_x Mortality_y
0 11000 2000 11000 1500
1 10000 2001 10000 1600
2 10000 2002 10000 1500
3 10000 2003 10000 1600
4 10000 2004 10000 1500
5 9300 2005 9300 1200
6 8900 2006 8900 1300
7 8700 2007 8700 1400
8 9000 2008 9000 1400
9 8600 2009 8600 1500
10 8300 2010 8300 1500
11 8100 2011 8100 1500
12 7800 2012 7800 1600
13 8000 2013 8000 1500
14 7500 2014 7500 1500
15 7500 2015 7500 1400
16 7300 2016 7300 1400
Answer 0 (score: 2)
Use concat to concatenate your DataFrames, then use groupby + plot to group and plot them by country:
df = pd.concat(
[df4, df5, df6], keys=['Argentina', 'Columbia', 'Brazil']
)
df.astype(int).groupby(level=0).plot.bar(x='Year', y='Mortality');
plt.show()
This gives you a separate chart for each group.
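To get all three countries onto one chart instead, one option is to pivot the combined data so that each country becomes its own column and then call plot.bar once. The following is only a minimal sketch under a few assumptions: the helper names to_wide and plot_grouped are made up for illustration, and only the first three years of the question's data are used to keep it short.

```python
import pandas as pd

# First three years of the question's data, kept short for illustration.
years = ['2000', '2001', '2002']
frames = {
    'Argentina': pd.DataFrame({'Year': years, 'Mortality': ['11000', '10000', '10000']}),
    'Columbia': pd.DataFrame({'Year': years, 'Mortality': ['1500 ', '1600', '1500']}),
    'Brazil': pd.DataFrame({'Year': years, 'Mortality': ['11000', '10000', '10000']}),
}


def to_wide(frames):
    """Hypothetical helper: stack the frames, then pivot to one column per country."""
    long_df = pd.concat(
        [df.assign(Country=name) for name, df in frames.items()],
        ignore_index=True,
    )
    # The values are strings (one has a trailing space), so clean and convert them.
    long_df['Mortality'] = long_df['Mortality'].str.strip().astype(int)
    return long_df.pivot(index='Year', columns='Country', values='Mortality')


def plot_grouped(wide):
    """Hypothetical helper: draw one group of bars per year, one color per country."""
    ax = wide.plot.bar(rot=90, title='South America')
    ax.set_xlabel('Year')
    ax.set_ylabel('Mortality')
    return ax


wide = to_wide(frames)
ax = plot_grouped(wide)
```

Calling plt.show() from matplotlib.pyplot (or ax.figure.savefig with a file name) afterwards displays or saves the chart. Since pandas draws every column of the wide frame as its own bar series, the colors and the legend follow from the pivot without any extra work.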
Answer 1 (score: 1)
You can use seaborn's factorplot (renamed catplot in seaborn 0.9), as follows:
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
# Jupyter/IPython magic; remove this line when running as a plain script
%matplotlib inline
datasetArgentina = {'Year': ["2000", "2001", "2002", "2003", "2004", "2005", "2006", "2007","2008", "2009", "2010", "2011", "2012", "2013", "2014", "2015","2016"], 'Mortality': ['11000', '10000' ,'10000' ,'10000' ,'10000' ,'9300' ,'8900' ,'8700', '9000' , '8600' ,'8300' ,'8100','7800' ,'8000', '7500', '7500', '7300']}
datasetColumbia = {'Year': ["2000", "2001", "2002", "2003", "2004", "2005", "2006", "2007","2008", "2009", "2010", "2011", "2012", "2013", "2014", "2015","2016"], 'Mortality': ['1500 ','1600', '1500' ,'1600' ,'1500', '1200' ,'1300', '1400' ,'1400', '1500' ,'1500' ,'1500' ,'1600' ,'1500', '1500', '1400', '1400']}
datasetBrazil = {'Year': ["2000", "2001", "2002", "2003", "2004", "2005", "2006", "2007","2008", "2009", "2010", "2011", "2012", "2013", "2014", "2015","2016"], 'Mortality': ['11000', '10000' ,'10000' ,'10000' ,'10000' ,'9300' ,'8900' ,'8700', '9000' , '8600' ,'8300' ,'8100','7800' ,'8000', '7500', '7500', '7300']}
df4 = pd.DataFrame(datasetArgentina)
df5 = pd.DataFrame(datasetColumbia)
df6 = pd.DataFrame(datasetBrazil)
Additional code:
# add country field for each dataframe
df4['country'] = 'Argentina'
df5['country'] = 'Columbia'
df6['country'] = 'Brazil'
# Combine all dataframes
df = pd.concat([df4,df5,df6])
# convert to float
df['Mortality'] = df['Mortality'].astype(float)
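# factorplot was renamed catplot (and size became height) in seaborn 0.9;
# on seaborn 0.12 and later, errorbar=None replaces ci=None
sns.catplot(data=df, hue='country', x='Year', y='Mortality', kind='bar', ci=None, aspect=3, height=7);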
plt.xticks(rotation=45);
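The result is a single bar chart with the bars colored by country. For more information, see the seaborn documentation for factorplot (catplot in seaborn 0.9 and later).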
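The call in this answer is figure-level, so seaborn creates the figure itself. If the chart needs to go onto an existing matplotlib Axes, an axes-level sns.barplot call is one possible alternative. This is just a sketch, assuming the Mortality values have already been converted to numbers and using only the first three years of the question's data.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# First three years of the question's data, already converted to numbers.
years = ['2000', '2001', '2002']
df = pd.concat(
    [
        pd.DataFrame({'Year': years, 'Mortality': [11000, 10000, 10000], 'country': 'Argentina'}),
        pd.DataFrame({'Year': years, 'Mortality': [1500, 1600, 1500], 'country': 'Columbia'}),
        pd.DataFrame({'Year': years, 'Mortality': [11000, 10000, 10000], 'country': 'Brazil'}),
    ],
    ignore_index=True,
)

# The figure size is controlled by matplotlib here, not by aspect/height.
fig, ax = plt.subplots(figsize=(12, 4))
sns.barplot(data=df, x='Year', y='Mortality', hue='country', ax=ax)
ax.tick_params(axis='x', rotation=45)
fig.tight_layout()
```

Because each year and country pair has exactly one value, there is nothing to aggregate, and each bar simply shows that value.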