Plot points with median lines in seaborn using stripplot

Time: 2016-06-03 17:06:30

Tags: python matplotlib seaborn

I have the following plot in seaborn:

import pandas
import matplotlib.pyplot as plt
import seaborn as sns

df = pandas.DataFrame({"sample": ["X", "X", "X", "Y", "Y", "Y"],
                       "value": [0.2, 0.3, 0.4, 0.7, 0.75, 0.8],
                       "rep": ["a", "b", "c", "a", "b", "c"]})
plt.figure()
ax = sns.stripplot(x="sample", y="value", edgecolor="none",
                   hue="sample", palette="Set1", data=df)

# how to plot median line?
plt.show()

It plots the points in grayscale instead of using Set1, and the legend shows only X, not Y.


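I would also like to add a horizontal line at the median of X and at the median of Y. How can this be done? factorplot does not appear to have a horizontal line option.

2 answers:

Answer 0: (score: 2)

You can use matplotlib to draw the lines, and pandas can compute the median of each sample. I used seaborn 0.7.0 for this example: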

from pandas import DataFrame
import matplotlib.pyplot as plt
import seaborn as sns

df = DataFrame({"sample": ["X", "X", "X", "Y", "Y", "Y"],
                "value": [0.2, 0.3, 0.4, 0.7, 0.75, 0.8],
                "rep": ["a", "b", "c", "a", "b", "c"]})
# compute the median of the "value" column for each sample
xmed = df.loc[df["sample"] == "X", "value"].median()
ymed = df.loc[df["sample"] == "Y", "value"].median()

sns.stripplot(x="sample", y="value", edgecolor="none",
              hue="sample", palette="Set1", data=df)

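# current x-axis limits, so that each line spans the full plot width
x = plt.gca().get_xlim()

# draw one horizontal line per sample at its median value
plt.plot(x, len(x) * [xmed], color=sns.xkcd_rgb["pale red"])
plt.plot(x, len(x) * [ymed], color=sns.xkcd_rgb["denim blue"])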
plt.show()
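
If there are more than two samples, filtering once per sample gets repetitive. As a minimal sketch (not part of the original answer), the medians can probably be computed in one pass with a pandas groupby. The variable name `medians` is just an illustrative choice:

```python
import pandas as pd

df = pd.DataFrame({"sample": ["X", "X", "X", "Y", "Y", "Y"],
                   "value": [0.2, 0.3, 0.4, 0.7, 0.75, 0.8],
                   "rep": ["a", "b", "c", "a", "b", "c"]})

# one median per sample; the result is a Series indexed by sample name
medians = df.groupby("sample")["value"].median()

for sample_name, median_val in medians.items():
    print(sample_name, median_val)
```

Each value in `medians` can then be passed to `plt.plot` in the same way as `xmed` and `ymed` above.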


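Answer 1: (score: 2)

We can restrict the width of each median line to its own column by looping over the Axes ticks and tick labels after generating the stripplot. This also lets the code work regardless of how many samples (columns) are plotted.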


    import pandas as pd
    import seaborn as sns
    import matplotlib.pyplot as plt

    df = pd.DataFrame({"sample": ["X", "X", "X", "Y", "Y", "Y"],
                       "value": [0.2, 0.3, 0.4, 0.7, 0.75, 0.8],
                       "rep": ["a", "b", "c", "a", "b", "c"]})

    ax = sns.stripplot(x="sample", y="value", data=df, palette="Set1", s=8)

    # fraction of the "X" or "Y" stripplot column width to span, in this case 40%
    median_width = 0.4

    for tick, text in zip(ax.get_xticks(), ax.get_xticklabels()):
        sample_name = text.get_text()  # "X" or "Y"

        # calculate the median value for all replicates of either X or Y
        median_val = df[df['sample']==sample_name].value.median()

        # plot horizontal lines across the column, centered on the tick
        ax.plot([tick-median_width/2, tick+median_width/2], [median_val, median_val],
                lw=4, color='k')

    plt.show()

Figure: seaborn stripplot with median lines drawn
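
A possible variation on the loop above, offered only as a sketch: compute all medians with a groupby and draw every segment in a single `ax.hlines` call. The helper name `draw_median_segments` and its parameters are hypothetical, and it assumes that category i sits at x = i on a categorical axis, with the category order passed in explicitly instead of read from the tick labels. To stay self-contained it uses a plain matplotlib Axes and the non-interactive Agg backend.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no display is needed
import matplotlib.pyplot as plt
import pandas as pd


def draw_median_segments(ax, data, x, y, order, width=0.4, **line_kwargs):
    """Hypothetical helper: draw one horizontal median segment per category."""
    # median of y for each category, arranged in the plotting order
    medians = data.groupby(x)[y].median().reindex(order)
    # assumption: category i is drawn at x position i
    positions = range(len(order))
    ax.hlines(medians.to_numpy(),
              [p - width / 2 for p in positions],
              [p + width / 2 for p in positions],
              **line_kwargs)
    return medians


df = pd.DataFrame({"sample": ["X", "X", "X", "Y", "Y", "Y"],
                   "value": [0.2, 0.3, 0.4, 0.7, 0.75, 0.8],
                   "rep": ["a", "b", "c", "a", "b", "c"]})

fig, ax = plt.subplots()
medians = draw_median_segments(ax, df, "sample", "value",
                               order=["X", "Y"], colors="k", linewidth=4)
```

If the Axes comes from `sns.stripplot`, passing the same list as its `order` argument should keep the segment positions aligned with the plotted columns.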