我有一个由均值和分布的std-dev组成的数据框
df.head()
+---+---------+----------------+-------------+---------------+------------+
| | user_id | session_id | sample_mean | sample_median | sample_std |
+---+---------+----------------+-------------+---------------+------------+
| 0 | 1 | 20081023025304 | 4.972789 | 5 | 0.308456 |
| 1 | 1 | 20081023025305 | 5.000000 | 5 | 1.468418 |
| 2 | 1 | 20081023025306 | 5.274419 | 5 | 4.518189 |
| 3 | 1 | 20081024020959 | 4.634855 | 5 | 1.387244 |
| 4 | 1 | 20081026134407 | 5.088195 | 5 | 2.452059 |
+---+---------+----------------+-------------+---------------+------------+
由此,我绘制了分布的直方图
plt.hist(df['sample_mean'],bins=50)
plt.xlabel('sampling rate (sec)')
plt.ylabel('Frequency')
plt.title('Histogram of trips mean sampling rate')
plt.show()
然后我写一个函数来计算pdf
和cdf
,并传递数据框和列名:
def compute_distrib(df, col):
stats_df = df.groupby(col)[col].agg('count').pipe(pd.DataFrame).rename(columns = {col: 'frequency'})
# PDF
stats_df['pdf'] = stats_df['frequency'] / sum(stats_df['frequency'])
# CDF
stats_df['cdf'] = stats_df['pdf'].cumsum()
stats_df = stats_df.reset_index()
return stats_df
例如:
stats_df = compute_distrib(df, 'sample_mean')
stats_df.head(2)
+---+---------------+-----------+----------+----------+
| | sample_median | frequency | pdf | cdf |
+---+---------------+-----------+----------+----------+
| 0 | 1 | 4317 | 0.143575 | 0.143575 |
| 1 | 2 | 10169 | 0.338200 | 0.481775 |
+---+---------------+-----------+----------+----------+
然后我以这种方式绘制cdf
分布:
ax1 = stats_df.plot(x = 'sample_mean', y = ['cdf'], grid = True)
ax1.legend(loc='best')
目标: 我想将这些图形并排绘制在一个图形中,而不是分别绘制并以某种方式将它们放到幻灯片中。
答案 0 :(得分:2)
您可以使用matplotlib.pyplot.subplots
并排绘制多个图:
import matplotlib.pyplot as plt
fig, axs = plt.subplots(nrows=1, ncols=2)
# Pass the data you wish to plot.
axs[0][0].hist(...)
axs[0][1].plot(...)
plt.show()