I have the following plot in seaborn:
import pandas
import matplotlib.pyplot as plt
import seaborn as sns
df = pandas.DataFrame({"sample": ["X", "X", "X", "Y", "Y", "Y"],
                       "value": [0.2, 0.3, 0.4, 0.7, 0.75, 0.8],
                       "rep": ["a", "b", "c", "a", "b", "c"]})
plt.figure()
ax = sns.stripplot(x="sample", y="value", edgecolor="none",
                   hue="sample", palette="Set1", data=df)
# how to plot median line?
plt.show()
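It draws the points in grayscale instead of using Set1, and the legend shows only X, not Y.
I would also like to add a horizontal line at the median of X and of Y. How can this be done? factorplot does not seem to have a horizontal-line option.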
Answer 0 (score: 2)
You can draw the lines with matplotlib, and pandas can compute the median of each group. I used seaborn 0.7.0 for this example:
from pandas import DataFrame
import matplotlib.pyplot as plt
import seaborn as sns
df = DataFrame({"sample": ["X", "X", "X", "Y", "Y", "Y"],
                "value": [0.2, 0.3, 0.4, 0.7, 0.75, 0.8],
                "rep": ["a", "b", "c", "a", "b", "c"]})
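# compute the median of "value" for each sample; selecting the column first
# avoids calling median() on the non-numeric columns
xmed = df.loc[df["sample"] == "X", "value"].median()
ymed = df.loc[df["sample"] == "Y", "value"].median()
sns.stripplot(x="sample", y="value", edgecolor="none",
              hue="sample", palette="Set1", data=df)
# current x-axis limits, so each line spans the full width of the plot
x = plt.gca().get_xlim()
# draw one horizontal line per median
plt.plot(x, len(x) * [xmed], color=sns.xkcd_rgb["pale red"])
plt.plot(x, len(x) * [ymed], color=sns.xkcd_rgb["denim blue"])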
plt.show()
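As a side note, the two per-sample medians above can also be computed in one call with `groupby`, which avoids repeating the filter for every sample. The following is only a minimal sketch, not part of the original answer: it uses a plain matplotlib scatter as a stand-in for `sns.stripplot`, and it assumes the categories sit at integer x positions 0, 1, ... in order of first appearance. The half-width of 0.2 is an arbitrary illustrative value.

```python
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({"sample": ["X", "X", "X", "Y", "Y", "Y"],
                   "value": [0.2, 0.3, 0.4, 0.7, 0.75, 0.8],
                   "rep": ["a", "b", "c", "a", "b", "c"]})

# One median per sample; sort=False keeps the order of first appearance
medians = df.groupby("sample", sort=False)["value"].median()

fig, ax = plt.subplots()
positions = list(range(len(medians)))
for pos, (name, med) in zip(positions, medians.items()):
    values = df.loc[df["sample"] == name, "value"]
    # Stand-in for the strip plot: all points of a sample share one x position
    ax.scatter([pos] * len(values), values)
    # Short horizontal median line centered on the sample's position
    ax.hlines(med, pos - 0.2, pos + 0.2, colors="k", linewidth=2)
ax.set_xticks(positions)
ax.set_xticklabels(medians.index)
```

Calling `plt.show()` afterwards displays the figure. Because the loop runs over `medians`, nothing needs to change when more samples are added to the data frame.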
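Answer 1 (score: 2)
We can restrict the width of each median line to its own column by looping over the Axes ticks and tick labels after generating the stripplot. This also makes the code independent of the number of samples (columns) being plotted.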
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
df = pd.DataFrame({"sample": ["X", "X", "X", "Y", "Y", "Y"],
                   "value": [0.2, 0.3, 0.4, 0.7, 0.75, 0.8],
                   "rep": ["a", "b", "c", "a", "b", "c"]})
ax = sns.stripplot(x="sample", y="value", data=df, palette="Set1", s=8)
# distance across the "X" or "Y" stripplot column to span, in this case 40%
median_width = 0.4
for tick, text in zip(ax.get_xticks(), ax.get_xticklabels()):
    sample_name = text.get_text()  # "X" or "Y"
    # calculate the median value for all replicates of either X or Y
    median_val = df[df['sample'] == sample_name].value.median()
    # plot horizontal lines across the column, centered on the tick
    ax.plot([tick - median_width / 2, tick + median_width / 2],
            [median_val, median_val], lw=4, color='k')
plt.show()