PYTHON: Best way to reproduce this plot?

Time: 2020-06-09 10:30:00

Tags: python r matplotlib graph seaborn

I am trying to reproduce something similar to a very nice plot I found online (made in R):

enter image description here

I am looking for a way to get the same result in Python. So far, using seaborn stripplot, seaborn pointplot, and axvline for the median, I have managed to produce the following:

enter image description here

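Setting aside the data preprocessing (I do not yet know how that will turn out), I would like to know how to add a colored line from each category's central point to the vertical median line.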

Should I somehow use a lollipop plot of the medians instead of the pointplot?

Edit: Thanks to Sheldore's input, I used hlines and got the following result:

enter image description here

Full code below:

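import matplotlib.pyplot as plt
import seaborn as sns

# Rank the regions by their mean Value, ascending (merged_df is the prepared DataFrame)
ranks = merged_df.groupby("region")["Value"].mean().fillna(0).sort_values(ascending=True).index
# y positions (0 .. n-1) for the hlines later
range_plot = range(0,len(ranks))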

#Create figure
plt.figure(figsize = (12,7))

# define colors  https://learnui.design/tools/data-color-picker.html#palette
#colors= ['#2a6d85','#198992','#3ba490','#74bc84','#b6cf78','#ffdc7a']
colors= ['#003f5c','#444e86','#955196','#dd5182','#ff6e54','#ffa600']
sns.set_palette(sns.color_palette(colors))
sns.set_context("paper")

# Set the font to be serif, rather than sans
sns.set(font='serif')
# Make the background white, and specify the
# specific font family
sns.set_style("white", {
        "font.family": "serif",
        "font.serif": ["Times", "Palatino", "serif"]})

#Create stripplot
ax = sns.stripplot(x='Value',
              y='region',
              data=merged_df,
              palette=sns.color_palette(colors),
              size=6,
              linewidth=0.4,
              alpha=.15,
              zorder=1,
              order = ranks)
#Create Conditional means
ax = sns.pointplot(x="Value", 
              y="region",
              data=merged_df,
              palette=sns.color_palette(colors),
              scale=2,
              ci=None,
              edgecolors="red",
              linewidth=4,
              order = ranks,
              zorder=3)
# Add the overall mean as a dashed vertical reference line
# (not assigned to ax, so ax keeps pointing at the Axes)
plt.axvline(merged_df.Value.mean(),
            color='grey',
            linestyle='dashed',
            linewidth=1,
            zorder=0)
plt.text(x=merged_df.Value.mean()+1,
         y=-0.1,
         s= 'Mean: {number:.{digits}f}'.format(number=merged_df.Value.mean(),digits=0))
# Draw a horizontal line from the overall mean to each region's mean
mean = merged_df.Value.mean()
x_arr = merged_df.groupby("region")["Value"].mean().fillna(0).sort_values(ascending=True)
plt.hlines(y=range_plot,
           xmin=mean,
           xmax=x_arr,
           colors=colors,
           linewidth=3,
           zorder=3)

# Add the title
plt.text(x= 4.2,
         y= -0.65,
         s = '{}'.format(merged_df.Indicator.iloc[0]),
         fontsize = 22)
# Adjust the size of the tick labels and axis labels
plt.tick_params(axis='both', which='major', labelsize=15)
plt.tick_params(axis='both', which='minor', labelsize=15)
plt.xlabel('Student to teacher ratio',fontsize=15)
plt.ylabel('')

# Add the source
plt.text(x= merged_df.Value.max()-25,
         y= 6.4,
         s = 'Data: UNESCO institute for statistics',fontsize = 12, color = 'grey')

plt.tight_layout(rect=[0, 0, 1, 0.95])
plt.savefig("UNESCO.jpeg", transparent=True, dpi=300)
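The code above depends on merged_df, which is not shown. To try it without the UNESCO data, a stand-in could be built roughly as follows. This is only a sketch: the column names (region, Value, Indicator) are taken from the code above, while the helper name make_fake_merged_df, the region names and all the numbers are made up for illustration.

```python
import numpy as np
import pandas as pd


def make_fake_merged_df(n_per_region=30, seed=0):
    """Build a random stand-in with the three columns the plotting code reads.

    Hypothetical helper: the real merged_df comes from the UNESCO data,
    which is not part of this post.
    """
    rng = np.random.default_rng(seed)
    # Six invented regions, to match the six colours in the palette
    regions = ["Region A", "Region B", "Region C",
               "Region D", "Region E", "Region F"]
    frames = []
    for i, region in enumerate(regions):
        # Each region gets a different centre so the means are spread out
        values = rng.normal(loc=15 + 4 * i, scale=5, size=n_per_region).clip(min=1)
        frames.append(pd.DataFrame({"region": region, "Value": values}))
    df = pd.concat(frames, ignore_index=True)
    # The plotting code reads the title from the first row of this column
    df["Indicator"] = "Pupil-teacher ratio (made-up data)"
    return df


merged_df = make_fake_merged_df()
```

With random data the hard-coded text positions (for example x=4.2 for the title) will probably need adjusting.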

1 Answer:

Answer 0 (score: 1)

You have the following options:

1) Use a vertical lollipop plot, as shown here

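2) Or use plt.hlines to draw a horizontal line for each country from the vertical median (24) to the point, as shown here. A modification of the latter example could look like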

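import numpy
from matplotlib import pyplot

mean = 24  # position of the vertical reference line

# Ten random x positions scattered around the reference value, one per row
x_arr = mean - numpy.random.randint(-10, 10, 10)
y_arr = numpy.arange(10)

# Horizontal line from the reference value to each point
pyplot.hlines(y_arr, mean, x_arr, color='red')
pyplot.plot(x_arr, y_arr, 'o')
# Dashed vertical reference line spanning the full axis height
pyplot.axvline(mean, 0, 1, color='k', linestyle='--')
pyplot.xlim(8, 82)
pyplot.show()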

enter image description here
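The same idea can be applied to grouped data with one color per category, which is closer to what the question asks for. The following is a rough sketch rather than the asker's actual code: the function name plot_group_means, its default column names and the demo data are all assumptions made for illustration. It computes the per-group means with pandas, then uses hlines with a list of colors and a matching scatter for the dots.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt


def plot_group_means(df, value_col="Value", group_col="region", colors=None):
    """Dots at each group's mean, joined to the overall mean by coloured lines."""
    group_means = df.groupby(group_col)[value_col].mean().sort_values()
    overall_mean = df[value_col].mean()
    y_pos = np.arange(len(group_means))
    if colors is None:
        # Fall back to one colour per group from matplotlib's default cycle
        cycle = plt.rcParams["axes.prop_cycle"].by_key()["color"]
        colors = [cycle[i % len(cycle)] for i in range(len(group_means))]

    fig, ax = plt.subplots(figsize=(8, 4))
    # Dashed vertical reference line at the overall mean
    ax.axvline(overall_mean, color="grey", linestyle="dashed", linewidth=1, zorder=0)
    # One horizontal segment per group, from the overall mean to the group mean
    ax.hlines(y=y_pos, xmin=overall_mean, xmax=group_means.to_numpy(),
              colors=colors, linewidth=3, zorder=2)
    # Dots at the group means, coloured to match their segments
    ax.scatter(group_means.to_numpy(), y_pos, c=colors, s=120, zorder=3)
    ax.set_yticks(y_pos)
    ax.set_yticklabels(group_means.index)
    return fig, ax, group_means


# Made-up demo data: four groups with different centres
rng = np.random.default_rng(1)
demo = pd.DataFrame({
    "region": np.repeat(["A", "B", "C", "D"], 25),
    "Value": np.concatenate([rng.normal(m, 4, 25) for m in (14, 20, 27, 35)]),
})
fig, ax, means = plot_group_means(demo)
```

Call plt.show() or fig.savefig(...) afterwards to see the result. Passing colors=[...] with one entry per group reproduces a custom palette such as the one in the question.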