Question

My data has the following format:

|      | Measurement 1 |      | Measurement 2 |      |
|------|---------------|------|---------------|------|
|      | Mean          | Std  | Mean          | Std  |
| Time |               |      |               |      |
| 0    | 17            | 1.10 | 21            | 1.33 |
| 1    | 16            | 1.08 | 21            | 1.34 |
| 2    | 14            | 0.87 | 21            | 1.35 |
| 3    | 11            | 0.86 | 21            | 1.33 |

I use the following code to generate a matplotlib line plot from this data, with the standard deviation shown as a filled region:

import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter

def seconds_to_minutes(x, pos):
    minutes = f'{round(x/60, 0)}'
    return minutes

fig, ax = plt.subplots()
mean_temperature_over_time['Measurement 1']['mean'].plot(kind='line', yerr=mean_temperature_over_time['Measurement 1']['std'], alpha=0.15, ax=ax)
mean_temperature_over_time['Measurement 2']['mean'].plot(kind='line', yerr=mean_temperature_over_time['Measurement 2']['std'], alpha=0.15, ax=ax)

ax.set(title="A Line Graph with Shaded Error Regions", xlabel="x", ylabel="y")
formatter = FuncFormatter(seconds_to_minutes)
ax.xaxis.set_major_formatter(formatter)
ax.grid()
ax.legend(['Mean 1', 'Mean 2'])

Output:

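This seems like a very messy solution, and it only looks shaded because I have so many data points that the individual error bars merge together. What is the correct way to produce a line plot with shaded error regions from a dataframe? I looked at "Plot yerr/xerr as shaded region rather than error bars", but could not adapt it to my case.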

Answer 1

What is wrong with the linked solution? It seems pretty straightforward.

Allow me to rearrange your dataset so that it is easier to load into a Pandas DataFrame:

   Time  Measurement  Mean   Std
0     0            1    17  1.10
1     1            1    16  1.08
2     2            1    14  0.87
3     3            1    11  0.86
4     0            2    21  1.33
5     1            2    21  1.34
6     2            2    21  1.35
7     3            2    21  1.33

fig, ax = plt.subplots()
for i, m in df.groupby("Measurement"):
    # line for the mean, translucent band for mean +/- one standard deviation
    ax.plot(m.Time, m.Mean)
    ax.fill_between(m.Time, m.Mean - m.Std, m.Mean + m.Std, alpha=0.35)
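
For reference, a self-contained sketch of the same loop, assuming the long-format values shown in the table above. The legend labels, axis labels, and the headless `Agg` backend are illustrative additions, not part of the original snippet.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption); remove to display interactively
import matplotlib.pyplot as plt
import pandas as pd

# Long-format data copied from the table above.
df = pd.DataFrame({
    "Time":        [0, 1, 2, 3, 0, 1, 2, 3],
    "Measurement": [1, 1, 1, 1, 2, 2, 2, 2],
    "Mean":        [17, 16, 14, 11, 21, 21, 21, 21],
    "Std":         [1.10, 1.08, 0.87, 0.86, 1.33, 1.34, 1.35, 1.33],
})

fig, ax = plt.subplots()
for measurement, m in df.groupby("Measurement"):
    # Mean as a line, mean +/- one standard deviation as a translucent band.
    ax.plot(m["Time"], m["Mean"], label=f"Mean {measurement}")
    ax.fill_between(m["Time"], m["Mean"] - m["Std"], m["Mean"] + m["Std"], alpha=0.35)

ax.set(xlabel="Time", ylabel="Value")
ax.legend()
```

Call `plt.show()` with an interactive backend, or `fig.savefig(...)`, to view the figure.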

Here is the result with some randomly generated data:

Edit

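Since the problem is apparently iterating over your particular dataframe format, let me show how you could do it (I am new to pandas, so there may be better ways). If I understand your screenshot correctly, you should have something like: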

Measurement    1          2
            Mean   Std Mean   Std
Time
0             17  1.10   21  1.33
1             16  1.08   21  1.34
2             14  0.87   21  1.35
3             11  0.86   21  1.33

df.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 4 entries, 0 to 3
Data columns (total 4 columns):
(1, Mean)    4 non-null int64
(1, Std)     4 non-null float64
(2, Mean)    4 non-null int64
(2, Std)     4 non-null float64
dtypes: float64(2), int64(2)
memory usage: 160.0 bytes

df.columns
MultiIndex(levels=[[1, 2], [u'Mean', u'Std']],
           labels=[[0, 0, 1, 1], [0, 1, 0, 1]],
           names=[u'Measurement', None])

You should be able to iterate over it and get the same plot:

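for measurement in df.columns.get_level_values("Measurement").unique():
    m = df[measurement]  # sub-frame with "Mean" and "Std" columns; "Time" is the index
    ax.plot(m.index, m["Mean"])
    ax.fill_between(m.index, m["Mean"] - m["Std"], m["Mean"] + m["Std"], alpha=0.35)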
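
A minimal runnable sketch of iterating over a frame with this two-level column layout, assuming the values from the question's table and that the top column level is named `Measurement`. The variable name `wide`, the legend labels, and the headless `Agg` backend are illustrative choices.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption); remove to display interactively
import matplotlib.pyplot as plt
import pandas as pd

# Wide frame: top column level "Measurement" (1, 2), second level Mean/Std, index "Time".
columns = pd.MultiIndex.from_product([[1, 2], ["Mean", "Std"]], names=["Measurement", None])
wide = pd.DataFrame(
    [[17, 1.10, 21, 1.33],
     [16, 1.08, 21, 1.34],
     [14, 0.87, 21, 1.35],
     [11, 0.86, 21, 1.33]],
    index=pd.Index([0, 1, 2, 3], name="Time"),
    columns=columns,
)

fig, ax = plt.subplots()
for measurement in wide.columns.get_level_values("Measurement").unique():
    m = wide[measurement]  # sub-frame with plain "Mean" and "Std" columns
    ax.plot(m.index, m["Mean"], label=f"Mean {measurement}")
    ax.fill_between(m.index, m["Mean"] - m["Std"], m["Mean"] + m["Std"], alpha=0.35)

ax.legend()
```

Selecting `wide[measurement]` drops the top column level, so inside the loop the data looks like an ordinary frame indexed by time.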

Or you can reshape it into the format above using:
(df.stack("Measurement")      # stack "Measurement" columns row by row
 .reset_index()               # make "Time" a normal column, add a new index
 .sort_values("Measurement")  # group values from the same Measurement
 .reset_index(drop=True))     # drop sorted index and make a new one
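
To see what this reshaping produces, here is a small sketch that assumes the wide frame from the question's table. The names `wide` and `long_df` are illustrative, and the sort uses both `Measurement` and `Time` (a deliberate deviation from the chain above) so that the row order within each measurement does not depend on the sort algorithm.

```python
import pandas as pd

# Wide frame: top column level "Measurement" (1, 2), second level Mean/Std, index "Time".
columns = pd.MultiIndex.from_product([[1, 2], ["Mean", "Std"]], names=["Measurement", None])
wide = pd.DataFrame(
    [[17, 1.10, 21, 1.33],
     [16, 1.08, 21, 1.34],
     [14, 0.87, 21, 1.35],
     [11, 0.86, 21, 1.33]],
    index=pd.Index([0, 1, 2, 3], name="Time"),
    columns=columns,
)

long_df = (
    wide.stack("Measurement")              # move the "Measurement" level into the row index
    .reset_index()                         # turn "Time" and "Measurement" into columns
    .sort_values(["Measurement", "Time"])  # keep each measurement together, ordered by time
    .reset_index(drop=True)                # replace the shuffled index with 0..n-1
)
print(long_df)
```

The resulting frame has one row per (Time, Measurement) pair, which is the layout that `groupby("Measurement")` expects.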
