Getting an overlaid histogram with matplotlib

Time: 2018-08-02 07:22:08

Tags: python pandas matplotlib kaggle

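I am new to Python, and I am trying to plot an overlaid histogram for a dataset from Kaggle using matplotlib. The dataset records gun violence in the United States over recent years. I selected only a few columns for EDA.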

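 import pandas as pd

 data_set = pd.read_csv("C:/Users/Lenovo/Documents/R related Topics/Assignment/Assignment_day2/04 Assignment/GunViolence.csv")
 # Take a copy so that adding columns does not raise SettingWithCopyWarning
 state_wise_crime = data_set[['date', 'state', 'n_killed', 'n_injured']].copy()

 date_value = pd.to_datetime(state_wise_crime['date'])

 # 'Year' is needed by the groupby below; 'Month' is kept from the original code
 state_wise_crime['Year'] = date_value.dt.year
 state_wise_crime['Month'] = date_value.dt.month
 # drop() returns a new frame, so assign the result back
 state_wise_crime = state_wise_crime.drop('date', axis=1)

 # Select several columns with a list (double brackets) before summing
 no_of_killed = state_wise_crime.groupby(['state', 'Year'])[['n_killed', 'n_injured']].sum()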


I want an overlaid histogram that shows the number of people killed and the number of people injured, with the different states on the x-axis.

1 answer:

Answer 0: (score: 0)

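Welcome to Stack Overflow! Next time, please post your data in the format below (rather than as a link or an image) so that it is easier for us to work on the problem. Also, if you are asking about graphical output, showing what the desired figure should look like (even a hand-drawn sketch) would be very helpful :)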


df

    state   Year    n_killed    n_injured
0   Alabama 2013    9           3
1   Alabama 2014    591         325
2   Alabama 2015    562         385
3   Alabama 2016    761         488
4   Alabama 2017    856         544
5   Alabama 2018    219         135
6   Alaska  2014    49          29
7   Alaska  2015    84          70
8   Alaska  2016    103         88
9   Alaska  2017    70          69
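
For anyone who wants to reproduce the example, here is a minimal sketch that rebuilds the sample above as a DataFrame. The values are copied from the printed rows; the assumption is that `df` corresponds to something like the asker's `no_of_killed.reset_index()`, which the original post does not confirm.

```python
import pandas as pd

# Rows copied by hand from the sample printed above.
# Assumption: in the real workflow df would come from the grouped
# result (e.g. no_of_killed.reset_index()) rather than being typed in.
df = pd.DataFrame({
    "state": ["Alabama"] * 6 + ["Alaska"] * 4,
    "Year": [2013, 2014, 2015, 2016, 2017, 2018, 2014, 2015, 2016, 2017],
    "n_killed": [9, 591, 562, 761, 856, 219, 49, 84, 103, 70],
    "n_injured": [3, 325, 385, 488, 544, 135, 29, 70, 88, 69],
})

print(df)
```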

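As I commented on your original post, a bar chart is more appropriate than a histogram here, because your goal seems to be to compare yearly aggregated statistics (sums) across states. As far as I know, the simplest option is to use Seaborn. It depends on how you want to present the data, but here is one example. The code is simple: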

import seaborn as sns
import matplotlib.pyplot as plt

sns.barplot(x='Year', y='n_killed', hue='state', data=df)
plt.show()

Output:

[Image not available: bar chart of n_killed by Year, with one bar color per state]
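
The question asked for killed and injured counts overlaid, with states on the x-axis. One possible way to do that with plain matplotlib is sketched below. This is not part of the original answer: the sample rows, the summing over all years, the bar widths and the output file name are all assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: render to a file; remove to use an interactive backend
import matplotlib.pyplot as plt
import pandas as pd

# Sample rows copied from the answer's table (illustrative only)
df = pd.DataFrame({
    "state": ["Alabama"] * 6 + ["Alaska"] * 4,
    "Year": [2013, 2014, 2015, 2016, 2017, 2018, 2014, 2015, 2016, 2017],
    "n_killed": [9, 591, 562, 761, 856, 219, 49, 84, 103, 70],
    "n_injured": [3, 325, 385, 488, 544, 135, 29, 70, 88, 69],
})

# Sum both measures per state across all years
totals = df.groupby("state")[["n_killed", "n_injured"]].sum()
states = totals.index.tolist()

fig, ax = plt.subplots()
# Both series share the same x positions, so the bars overlap;
# the second series is narrower and slightly transparent so both stay visible.
ax.bar(states, totals["n_killed"], width=0.8, label="n_killed")
ax.bar(states, totals["n_injured"], width=0.5, alpha=0.8, label="n_injured")
ax.set_xlabel("state")
ax.set_ylabel("count (summed over years)")
ax.legend()

# Hypothetical output file name
fig.savefig("killed_vs_injured_by_state.png")
```

With the full dataset there would be around fifty states on the x-axis, so rotating the tick labels (for example with `ax.tick_params(axis="x", rotation=90)`) would probably be needed to keep them readable.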

Hope this helps.