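Plotting the mean of multiple dataframes with different indices

Time: 2020-06-21 21:26:58

Tags: python pandas dataframe indexing

I am new to Python and would appreciate some help. I am working with dataframes that contain measurements taken over time, for example: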

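import matplotlib.pyplot as pl  # assumed: the question never shows what `pl` refers to
import pandas as pd
from scipy import stats

a, b = 0.5, 1.5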
mu, sigma = 1, 0.1
dist = stats.truncnorm((a - mu) / sigma, (b - mu) / sigma, loc=mu, scale=sigma)

valuesA1 = dist.rvs(11)

a, b = 0.5, 1.5
mu, sigma = 1, 0.1
dist = stats.truncnorm((a - mu) / sigma, (b - mu) / sigma, loc=mu, scale=sigma)

valuesA4 = dist.rvs(11)

a, b = 1.5, 2.5
mu, sigma = 2, 0.2
dist = stats.truncnorm((a - mu) / sigma, (b - mu) / sigma, loc=mu, scale=sigma)

valuesA2 = dist.rvs(11)

a, b = 1.5, 2.5
mu, sigma = 2, 0.2
dist = stats.truncnorm((a - mu) / sigma, (b - mu) / sigma, loc=mu, scale=sigma)

valuesA5 = dist.rvs(11)

a, b = 2.5, 3.5
mu, sigma = 3, 0.3
dist = stats.truncnorm((a - mu) / sigma, (b - mu) / sigma, loc=mu, scale=sigma)

valuesA3 = dist.rvs(11)

a, b = 2.5, 3.5
mu, sigma = 3, 0.3
dist = stats.truncnorm((a - mu) / sigma, (b - mu) / sigma, loc=mu, scale=sigma)

valuesA6 = dist.rvs(11)

df1 = {'A1': valuesA1,
       'A2': valuesA2,
       'A3': valuesA3,
       'A4': valuesA4,
       'A5': valuesA5,
       'A6': valuesA6}
df1 = pd.DataFrame(df1, columns=['A1', 'A2', 'A3', 'A4', 'A5', 'A6'],
                   index=[60.012, 120.45, 180.21, 240.42, 300.619, 360.67,
                          420.87, 480.65, 540.86, 600.35, 660.61])
df1.index.name = 'time'

df1

I am plotting the group means over time like this:

group_1 = ['A1','A4']
group_2 = ['A2','A5']
group_3 = ['A3','A6']
ID_list = [group_1, group_2, group_3]
labels = ['group_1', 'group_2', 'group_3']
colors = ["lightcoral", "teal", "yellowgreen"]

pl.plot()

for group_ID in [0, 1, 2]:

    # Select the columns that belong to this group
    df_selected = df1[ID_list[group_ID]]

    result = df_selected.aggregate(["mean", "sem"], axis=1)
    pl.plot(result.index, result["mean"].values, '-', color=colors[group_ID], label=labels[group_ID], alpha=1)
    pl.fill_between(result.index, result["mean"]-result["sem"],result["mean"]+result["sem"], color=colors[group_ID], alpha=0.1)

   
pl.legend()
pl.title("Title")
pl.ylabel("Y Axis")
pl.xlabel("Time (s)")
pl.ylim(0, 4)

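Now for my question. I want to plot the group means across several dataframes whose measurements were taken at slightly different time points. For example, a second dataframe: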

a, b = 0.5, 1.5
mu, sigma = 1, 0.1
dist = stats.truncnorm((a - mu) / sigma, (b - mu) / sigma, loc=mu, scale=sigma)

valuesA1 = dist.rvs(11)

a, b = 0.5, 1.5
mu, sigma = 1, 0.1
dist = stats.truncnorm((a - mu) / sigma, (b - mu) / sigma, loc=mu, scale=sigma)

valuesA4 = dist.rvs(11)

a, b = 1.5, 2.5
mu, sigma = 2, 0.2
dist = stats.truncnorm((a - mu) / sigma, (b - mu) / sigma, loc=mu, scale=sigma)

valuesA2 = dist.rvs(11)

a, b = 1.5, 2.5
mu, sigma = 2, 0.2
dist = stats.truncnorm((a - mu) / sigma, (b - mu) / sigma, loc=mu, scale=sigma)

valuesA5 = dist.rvs(11)

a, b = 2.5, 3.5
mu, sigma = 3, 0.3
dist = stats.truncnorm((a - mu) / sigma, (b - mu) / sigma, loc=mu, scale=sigma)

valuesA3 = dist.rvs(11)

a, b = 2.5, 3.5
mu, sigma = 3, 0.3
dist = stats.truncnorm((a - mu) / sigma, (b - mu) / sigma, loc=mu, scale=sigma)

valuesA6 = dist.rvs(11)

df2 = {'A1': valuesA1,
       'A2': valuesA2,
       'A3': valuesA3,
       'A4': valuesA4,
       'A5': valuesA5,
       'A6': valuesA6}
df2 = pd.DataFrame(df2, columns=['A1', 'A2', 'A3', 'A4', 'A5', 'A6'],
                   index=[60.2, 120.78, 180.54, 240.63, 300.19, 360.77,
                          420.45, 480.2, 540.55, 600.3, 660.01])
df2.index.name = 'time'

df2

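How can I plot the group means across multiple dataframes (group_1 would be A1 and A4 from df1 plus A1 and A4 from df2, and so on)? I would have used concatenation, but since the indices differ, that does not seem to be the right choice.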
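One possible direction, offered only as a sketch and not as a confirmed answer: snap each timestamp to a shared nominal grid first, so that concatenation lines the rows up again. The code below assumes the measurements are meant to fall roughly every 60 seconds, which is read off the sample indices and not stated in the question. The helper names `snap_index` and `group_stats` and the two small demo frames are made up for illustration.

```python
import numpy as np
import pandas as pd


def snap_index(df, step=60.0):
    """Return a copy of df with its index rounded to the nearest multiple of step.

    step=60.0 is an assumption taken from the sample data, where the
    timestamps sit close to multiples of 60 seconds.
    """
    out = df.copy()
    snapped = np.round(out.index.to_numpy() / step) * step
    out.index = pd.Index(snapped, name=df.index.name)
    return out


def group_stats(frames, columns, step=60.0):
    """Row-wise mean and SEM of the given columns, pooled over all frames."""
    snapped = [snap_index(f[columns], step) for f in frames]
    # keys= keeps the column labels unique as (frame number, column name).
    # This concat fails if two timestamps of one frame snap to the same value.
    combined = pd.concat(snapped, axis=1, keys=list(range(len(snapped))))
    return pd.DataFrame({"mean": combined.mean(axis=1),
                         "sem": combined.sem(axis=1)})


# Tiny made-up frames (not the data from the question) to show the call.
demo1 = pd.DataFrame({"A1": [1.0, 2.0], "A4": [3.0, 4.0]}, index=[60.012, 120.45])
demo2 = pd.DataFrame({"A1": [5.0, 6.0], "A4": [7.0, 8.0]}, index=[60.2, 120.78])
demo_result = group_stats([demo1, demo2], ["A1", "A4"])
print(demo_result)
```

With the frames from the question, `group_stats([df1, df2], group_1)` should return a frame with `mean` and `sem` columns that can be passed to the same `pl.plot` and `pl.fill_between` calls used in the loop above. If the time offsets were large compared with the sampling interval, interpolating every frame onto a common time grid would probably be a better fit than rounding.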

0 answers:

No answers