pandas: plotting columns from two DataFrames in one figure

Time: 2020-07-01 11:00:25

Tags: python pandas matplotlib

I have a DataFrame that holds the mean, median and standard deviation of a distribution for each session:

df.head()
+---+---------+----------------+-------------+---------------+------------+
|   | user_id |   session_id   | sample_mean | sample_median | sample_std |
+---+---------+----------------+-------------+---------------+------------+
| 0 |       1 | 20081023025304 | 4.972789    |             5 | 0.308456   |
| 1 |       1 | 20081023025305 | 5.000000    |             5 | 1.468418   |
| 2 |       1 | 20081023025306 | 5.274419    |             5 | 4.518189   |
| 3 |       1 | 20081024020959 | 4.634855    |             5 | 1.387244   |
| 4 |       1 | 20081026134407 | 5.088195    |             5 | 2.452059   |
+---+---------+----------------+-------------+---------------+------------+

From this, I plot a histogram of the distribution:

plt.hist(df['sample_mean'],bins=50)
plt.xlabel('sampling rate (sec)')
plt.ylabel('Frequency')
plt.title('Histogram of trips mean sampling rate')
plt.show()

Then I wrote a function that computes the PDF and CDF, taking the DataFrame and a column name:

def compute_distrib(df, col):
    stats_df = df.groupby(col)[col].agg('count').pipe(pd.DataFrame).rename(columns = {col: 'frequency'})
    
    # PDF
    stats_df['pdf'] = stats_df['frequency'] / sum(stats_df['frequency'])
    
    # CDF
    stats_df['cdf'] = stats_df['pdf'].cumsum()
    stats_df = stats_df.reset_index()
    return stats_df
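
To make the function's output concrete, here is a minimal sketch of the same computation on a small made-up series. The helper name `empirical_distribution` and the toy values are illustrative assumptions, not part of the original post, and `value_counts` is swapped in for the `groupby`/`agg` chain. Note that the "pdf" column is really the relative frequency of each distinct value, so it is most informative for columns with repeated values.

```python
import pandas as pd


def empirical_distribution(values: pd.Series) -> pd.DataFrame:
    """Return frequency, pdf and cdf for each distinct value, in ascending order.

    Hypothetical helper that mirrors compute_distrib for a single Series.
    """
    # Count how often each distinct value occurs, sorted by value.
    freq = values.value_counts().sort_index()
    out = pd.DataFrame({"frequency": freq})

    # Relative frequency of each value (the "pdf" of the post).
    out["pdf"] = out["frequency"] / out["frequency"].sum()

    # Running total of the relative frequencies.
    out["cdf"] = out["pdf"].cumsum()

    # Turn the distinct values back into a regular column.
    return out.rename_axis(values.name).reset_index()


# Made-up data for illustration only.
toy = pd.Series([1, 2, 2, 3, 3, 3, 3, 4], name="sample_median")
toy_stats = empirical_distribution(toy)
print(toy_stats)
```

The last value of the `cdf` column is always 1, which is a quick sanity check on the result.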

For example:

stats_df = compute_distrib(df, 'sample_mean')
stats_df.head(2)
+---+---------------+-----------+----------+----------+
|   | sample_median | frequency |   pdf    |   cdf    |
+---+---------------+-----------+----------+----------+
| 0 |             1 |      4317 | 0.143575 | 0.143575 |
| 1 |             2 |     10169 | 0.338200 | 0.481775 |
+---+---------------+-----------+----------+----------+

Then I plot the CDF like this:

ax1 = stats_df.plot(x = 'sample_mean', y = ['cdf'], grid = True)
ax1.legend(loc='best')

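Goal: I would like to plot these two graphs side by side in a single figure, rather than plotting them separately and somehow piecing them together on a slide.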

1 Answer:

Answer 0 (score: 2)

You can use matplotlib.pyplot.subplots to draw several plots side by side:

import matplotlib.pyplot as plt

fig, axs = plt.subplots(nrows=1, ncols=2)

# With a single row, axs is a 1-D array, so index it with one integer.
# Pass the data you wish to plot.
axs[0].hist(...)
axs[1].plot(...)

plt.show()
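
As a fuller sketch of how this could look for the question's two plots, the example below fills both panels. The data is randomly generated as a stand-in for the asker's df, and the "CDF" panel title and figsize are assumptions added for illustration. The key points are that `hist` is called on the Axes object rather than on `plt`, and that `DataFrame.plot` accepts an existing Axes through its `ax` parameter.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical stand-in for the question's df: 1,000 made-up sample means.
rng = np.random.default_rng(0)
df = pd.DataFrame(
    {"sample_mean": rng.normal(loc=5.0, scale=1.0, size=1000).round(1)}
)

# Same quantities as compute_distrib: frequency, pdf and cdf per distinct value.
freq = df["sample_mean"].value_counts().sort_index()
stats_df = pd.DataFrame({"frequency": freq})
stats_df["pdf"] = stats_df["frequency"] / stats_df["frequency"].sum()
stats_df["cdf"] = stats_df["pdf"].cumsum()
stats_df = stats_df.rename_axis("sample_mean").reset_index()

# One row, two columns: axs is a 1-D array holding two Axes objects.
fig, axs = plt.subplots(nrows=1, ncols=2, figsize=(12, 4))

# Left panel: the histogram, drawn with the Axes method instead of plt.hist.
axs[0].hist(df["sample_mean"], bins=50)
axs[0].set_xlabel("sampling rate (sec)")
axs[0].set_ylabel("Frequency")
axs[0].set_title("Histogram of trips mean sampling rate")

# Right panel: pandas draws onto an existing Axes when it is passed as ax=.
stats_df.plot(x="sample_mean", y="cdf", grid=True, ax=axs[1])
axs[1].legend(loc="best")
axs[1].set_title("CDF")  # hypothetical title, not from the original post

# Keep labels and titles of the two panels from overlapping.
fig.tight_layout()
```

Calling plt.show() afterwards displays the figure, and fig.savefig can write it to an image file for use in slides.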