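I have the Telco Customer Churn dataset. After analyzing the churn rate for different tenure durations, I want to visualize it like the plot below, where only the churn ("Yes") rate is plotted against the different tenure bins.
Here is my attempt: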
import pandas as pd
import matplotlib.pyplot as plt
# note: this is a GitHub "blob" page URL, which serves HTML; read_csv needs the raw CSV link
user_data = pd.read_csv("https://github.com/WedamN/Telco-Churn-Prediction/blob/master/CustomerChurnData.csv")
# bin the tenure into every 6 months
user_data['tenure_bin'] = pd.cut(user_data['Tenure'], list(range(0, 73, 6)))
# percentage of each Churn category within every tenure bin
churn_rate_according_to_tenure = user_data.groupby('tenure_bin').Churn.value_counts(normalize=True) * 100
# plot the results
churn_rate_according_to_tenure.plot.bar()
plt.show()
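This is the plot I get (a bit messy), which shows both the "Yes" and "No" categories. How can I fix this so that only the "Yes" category is shown, with all bars in the same color?
Answer 0 (score: 2)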
If you want both categories side by side, I think you need to reshape with unstack:
print(churn_rate_according_to_tenure.unstack())
Churn              No        Yes
tenure_bin
(0, 6]      46.666667  53.333333
(6, 12]     64.113475  35.886525
(12, 18]    67.700730  32.299270
(18, 24]    75.420168  24.579832
(24, 30]    78.190255  21.809745
(30, 36]    78.553616  21.446384
(36, 42]    78.100264  21.899736
(42, 48]    83.812010  16.187990
(48, 54]    83.809524  16.190476
(54, 60]    87.378641  12.621359
(60, 66]    90.712743   9.287257
(66, 72]    94.703390   5.296610
churn_rate_according_to_tenure.unstack().plot.bar()
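To see why the reshape helps, here is a minimal sketch on made-up toy data (not the Telco dataset; the bin labels and values are invented for illustration). The grouped value counts come back as a Series indexed by (tenure_bin, Churn) pairs, and unstack moves the Churn level into columns.

```python
import pandas as pd

# Made-up toy data: two tenure bins with four customers each.
toy = pd.DataFrame({
    "tenure_bin": ["0-6"] * 4 + ["6-12"] * 4,
    "Churn": ["Yes", "Yes", "Yes", "No", "Yes", "No", "No", "No"],
})

# groupby + value_counts returns a Series indexed by (tenure_bin, Churn) pairs,
# so a bar plot of it draws one bar per pair, mixing "Yes" and "No".
rates = toy.groupby("tenure_bin")["Churn"].value_counts(normalize=True) * 100

# unstack() moves the inner index level (Churn) into columns:
# one row per bin, one column per category.
wide = rates.unstack()
print(wide)
```

With the data in this wide shape, each column becomes its own bar series, which is why the plot groups "No" and "Yes" per bin.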
If you want to filter, select the category column:
churn_rate_according_to_tenure.unstack()['Yes'].plot.bar()
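As an alternative to unstacking and then selecting a column, the "Yes" rate can also be computed directly as the mean of a boolean mask per bin. The sketch below is only an illustration under assumptions: the helper name churn_rate_by_tenure, the randomly generated demo data, and the color and axis labels are all hypothetical, and the column names Tenure and Churn are taken from the question.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt


def churn_rate_by_tenure(df, step=6, max_tenure=72):
    """Return the percentage of 'Yes' churners per tenure bin (hypothetical helper)."""
    bins = list(range(0, max_tenure + 1, step))
    # Intervals are right-closed, e.g. (0, 6]; a tenure of 0 falls outside every bin.
    tenure_bin = pd.cut(df["Tenure"], bins)
    # The mean of a boolean Series is the fraction of True values.
    is_churned = df["Churn"].eq("Yes")
    return is_churned.groupby(tenure_bin, observed=False).mean() * 100


if __name__ == "__main__":
    # Randomly generated stand-in data, not the real Telco dataset.
    rng = np.random.default_rng(0)
    demo = pd.DataFrame({
        "Tenure": rng.integers(1, 73, size=500),
        "Churn": np.where(rng.random(500) < 0.3, "Yes", "No"),
    })
    yes_rate = churn_rate_by_tenure(demo)

    # A Series has a single set of bars, so one color applies to all of them.
    ax = yes_rate.plot.bar(color="tab:blue")
    ax.set_xlabel("Tenure bin (months)")
    ax.set_ylabel("Churn rate (%)")
    plt.tight_layout()
    plt.show()
```

Because the result is a single Series rather than a two-column frame, there is no "No" category left to filter out before plotting.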