My dataset is as follows:
import pandas as pd
data = dict(Pclass=[1,1,2,2,3,3],
Survived = [0,1,0,1,0,1],
CategorySize = [80,136,97,87,372,119] )
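I need to create a bar chart in Python using plotly, grouped by Pclass. Within each group there should be 2 bars, one for Survived=0 and one for Survived=1, with CategorySize on the Y axis. So I should end up with 6 bars in 3 groups.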
This is what I tried:
import plotly.offline as pyo
import plotly.graph_objects as go
data = [ go.Bar( x = PclassSurvived.Pclass, y = PclassSurvived.CategorySize ) ]
layout = go.Layout(title= 'Pclass-Survived', xaxis = dict(title = 'Pclass'), yaxis = dict(title = 'CategorySize'),barmode='group' )
fig = go.Figure(data = data, layout = layout)
pyo.plot( fig, filename='./Output/Pclass-Survived.html')
However, this is not what I need.
Answer 0: (score: 1)
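I'm having trouble with your sample dataset. PclassSurvived.Pclass and PclassSurvived.CategorySize are not defined, and it's not 100% clear to me what you are trying to accomplish here. But judging from your explanation and the structure of your dataset, this seems like it could help:
Code 1:
# imports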
from plotly.subplots import make_subplots
import plotly.figure_factory as ff
import plotly.graph_objs as go
import pandas as pd
import numpy as np
data = dict(Pclass=[1,1,2,2,3,3],
Survived = [0,1,0,1,0,1],
CategorySize = [80,136,97,87,372,119] )
df=pd.DataFrame(data)
s0=df.query('Survived==0')
s1=df.query('Survived==1')
#layout = go.Layout(title= 'Pclass-Survived', xaxis = dict(title = 'Pclass'), yaxis = dict(title = 'CategorySize'),barmode='group' )
fig = go.Figure()
fig.add_trace(go.Bar(x=s0['Pclass'], y = s0['CategorySize'],
name='dead'
)
)
fig.add_trace(go.Bar(x=s1['Pclass'], y = s1['CategorySize'],
name='alive'
)
)
fig.update_layout(barmode='group')
fig.show()
Edit: You can use the plotly.offline module to produce the same plot, as in Code 2. An alternative is to stack the bars instead of grouping them.
Code 2:
# Import the necessary libraries
import plotly.offline as pyo
import plotly.graph_objs as go
import pandas as pd
# Enable notebook mode so the figure renders offline
pyo.init_notebook_mode()
# data
data = dict(Pclass=[1,1,2,2,3,3],
Survived = [0,1,0,1,0,1],
CategorySize = [80,136,97,87,372,119] )
df=pd.DataFrame(data)
# Split the rows by survival status
s0=df.query('Survived==0')
s1=df.query('Survived==1')
fig = go.Figure()
fig.add_trace(go.Bar(x=s0['Pclass'], y = s0['CategorySize'],
name='dead'
)
)
fig.add_trace(go.Bar(x=s1['Pclass'], y = s1['CategorySize'],
name='alive'
)
)
pyo.iplot(fig, filename = 'your-library')
Answer 1: (score: 1)
This can be done with Pandas' groupby and Plotly Express.
You should group your data by the Pclass and Survived columns and apply the sum aggregate function to the CategorySize column.
That way you get 6 groups with their aggregated values, and you can easily plot a pair of side-by-side bars for each Pclass thanks to the barmode attribute (set to the 'group' value); you can read more about it in the documentation.
Code:
import pandas as pd
import plotly.express as px
data = pd.DataFrame(
    dict(
        Pclass=[1, 1, 2, 2, 3, 3],
        Survived=[0, 1, 0, 1, 0, 1],
        CategorySize=[80, 136, 97, 87, 372, 119],
    )
)
Now you group your data:
grouped_df = data.groupby(by=["Pclass", "Survived"], as_index=False).agg(
    {"CategorySize": "sum"}
)
and convert the Survived column values to strings (so that they are treated as a discrete variable rather than a numeric one):
grouped_df.Survived = grouped_df.Survived.map({0: "Died", 1: "Survived"})
Now you should have:
| | Pclass | Survived | CategorySize |
|---|---|---|---|
| 0 | 1 | Died | 80 |
| 1 | 1 | Survived | 136 |
| 2 | 2 | Died | 97 |
| 3 | 2 | Survived | 87 |
| 4 | 3 | Died | 372 |
| 5 | 3 | Survived | 119 |
Finally, you visualize your data: