Plotly: How to create a bar chart using group by?

Time: 2019-11-13 05:32:17

Tags: python plotly bar-chart

My dataset is as follows:

import pandas as pd
data = dict(Pclass=[1,1,2,2,3,3],
            Survived = [0,1,0,1,0,1],
            CategorySize = [80,136,97,87,372,119] )

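I need to create a bar chart with plotly in Python, grouped by Pclass. Each group should contain 2 bars, one for Survived=0 and one for Survived=1, with CategorySize on the Y axis. So I should end up with 6 bars in 3 groups.

Here is what I tried: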

import plotly.offline as pyo
import plotly.graph_objects as go

data = [ go.Bar( x = PclassSurvived.Pclass, y = PclassSurvived.CategorySize ) ]
layout = go.Layout(title= 'Pclass-Survived', xaxis = dict(title = 'Pclass'), yaxis = dict(title = 'CategorySize'),barmode='group' )
fig = go.Figure(data = data, layout = layout)

pyo.plot( fig, filename='./Output/Pclass-Survived.html')

However, this is not what I need.

2 Answers:

Answer 0: (score: 1)

I'm having trouble with your sample dataset: PclassSurvived.CategorySize is not defined, and it's not 100% clear to me what you want to accomplish here. But judging from your explanation and the structure of your dataset, this seems like it could help:

Plot 1:

[image: grouped bar chart]

Code 1:

# imports
import plotly.graph_objs as go
import pandas as pd

# data
data = dict(Pclass=[1,1,2,2,3,3],
            Survived = [0,1,0,1,0,1],
            CategorySize = [80,136,97,87,372,119] )
df=pd.DataFrame(data)

# one subset per Survived value
s0=df.query('Survived==0')
s1=df.query('Survived==1')

fig = go.Figure()

# one trace per subset, so each Pclass gets two bars
fig.add_trace(go.Bar(x=s0['Pclass'], y = s0['CategorySize'],
                    name='dead'
                    )
             )

fig.add_trace(go.Bar(x=s1['Pclass'], y = s1['CategorySize'],
                    name='alive'
                    )
             )

fig.update_layout(barmode='group')
fig.show()

Edit: you can produce the same plot with the plotly.offline module, as follows:

Code 2:

# Import the necessary libraries
import plotly.offline as pyo
import plotly.graph_objs as go
import pandas as pd

# Set notebook mode to work offline
pyo.init_notebook_mode()

# data
data = dict(Pclass=[1,1,2,2,3,3],
            Survived = [0,1,0,1,0,1],
            CategorySize = [80,136,97,87,372,119] )
df=pd.DataFrame(data)

# one subset per Survived value
s0=df.query('Survived==0')
s1=df.query('Survived==1')

fig = go.Figure()

fig.add_trace(go.Bar(x=s0['Pclass'], y = s0['CategorySize'],
                    name='dead'
                    )
             )

fig.add_trace(go.Bar(x=s1['Pclass'], y = s1['CategorySize'],
                    name='alive'
                    )
             )

pyo.iplot(fig, filename = 'your-library')

An alternative approach with stacked bars:

Plot 2:

[image: stacked bar chart]

Code 3:

Same traces as Code 1, with fig.update_layout(barmode='stack') in place of barmode='group'.

Answer 1: (score: 1)

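This can be done easily with Pandas groupby and Plotly Express.

Group your data by the Pclass and Survived columns, and apply the sum aggregation function to the CategorySize column.

This gives you 6 groups with their aggregated values, and you can easily plot a pair of side-by-side bars for each Pclass thanks to the barmode attribute (using the 'group' value); you can read more about it in the documentation.

Code:

import pandas as pd
import plotly.express as px

data = pd.DataFrame(
    dict(
        Pclass=[1, 1, 2, 2, 3, 3],
        Survived=[0, 1, 0, 1, 0, 1],
        CategorySize=[80, 136, 97, 87, 372, 119],
    )
)

Now group the data:

grouped_df = data.groupby(by=["Pclass", "Survived"], as_index=False).agg(
    {"CategorySize": "sum"}
)

And convert the Survived column values to strings (so that the column is treated as a discrete variable rather than a numeric one):

grouped_df.Survived = grouped_df.Survived.map({0: "Died", 1: "Survived",})

Now you should have:

   Pclass  Survived  CategorySize
0       1      Died            80
1       1  Survived           136
2       2      Died            97
3       2  Survived            87
4       3      Died           372
5       3  Survived           119

Finally, visualize the data with Plotly Express, using barmode="group":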

[image: grouped bar chart]