Overlaid density plots of multiple pandas DataFrame columns

Time: 2019-05-29 22:38:35

Tags: python python-2.7 matplotlib plot histogram

import numpy as np
import pandas as pd

col1 = np.random.normal(0, 1, (1000, ))
col2 = np.random.normal(0, 1, (1000, ))
col3 = np.random.normal(0, 1, (1000, ))
df = pd.DataFrame({'col1':col1, 'col2':col2, 'col3':col3})
  • Plot each column as a continuous line
  • Plot all 3 columns on the same axes
  • Use differently colored lines (no fill)

Thanks!

1 answer:

Answer 0 (score: 1)

I understand your question! Here is how I would do it in matplotlib.

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

col1 = np.random.normal(0, 1, (1000, ))
col2 = np.random.normal(1, 1, (1000, ))
col3 = np.random.normal(-1, 1, (1000, ))
df = pd.DataFrame({'col1':col1, 'col2':col2, 'col3':col3})

# Bin every column with the same edges so the three curves share one x-axis.
bins = np.arange(-10, 11, 0.5)
df['col1_bins'] = pd.cut(df['col1'], bins=bins)
df['col2_bins'] = pd.cut(df['col2'], bins=bins)
df['col3_bins'] = pd.cut(df['col3'], bins=bins)

# Count how many values of each column fall into each bin.
# observed=False keeps empty bins (needs pandas 0.23 or newer).
col1_counts = df[['col1_bins', 'col1']].groupby(['col1_bins'], observed=False).count().reset_index()
col2_counts = df[['col2_bins', 'col2']].groupby(['col2_bins'], observed=False).count().reset_index()
col3_counts = df[['col3_bins', 'col3']].groupby(['col3_bins'], observed=False).count().reset_index()

# One line per column, each in its own color, all on the same axes.
plt.plot(col1_counts['col1_bins'].astype(str), col1_counts['col1'], 'r')
plt.plot(col2_counts['col2_bins'].astype(str), col2_counts['col2'], 'b')
plt.plot(col3_counts['col3_bins'].astype(str), col3_counts['col3'], 'g')
plt.show()

Basically, you have to bin the data points first, and only then can you plot them.
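
The same bin-then-plot idea can probably be written more compactly with `np.histogram`. The following is only a sketch, not the method from the answer above: the fixed seed, the `Agg` backend, the axis labels and the output file name `density_overlay.png` are my own assumptions. Passing `density=True` normalizes each curve so that its area is 1, which is closer to a density plot than raw counts.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; remove to get a window
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.RandomState(0)  # assumption: fixed seed, only for reproducibility
df = pd.DataFrame({
    'col1': rng.normal(0, 1, 1000),
    'col2': rng.normal(1, 1, 1000),
    'col3': rng.normal(-1, 1, 1000),
})

bins = np.arange(-10, 11, 0.5)           # same bin edges as in the answer
centers = (bins[:-1] + bins[1:]) / 2.0   # x position of each bin: its midpoint

fig, ax = plt.subplots()
densities = {}
for name, color in zip(['col1', 'col2', 'col3'], ['r', 'b', 'g']):
    # density=True scales the counts so the area under each curve is 1
    dens, _ = np.histogram(df[name], bins=bins, density=True)
    densities[name] = dens
    ax.plot(centers, dens, color=color, label=name)  # line only, no fill

ax.set_xlabel('value')
ax.set_ylabel('density')
ax.legend()
fig.savefig('density_overlay.png')  # hypothetical output file name
```

Here the x-axis is numeric (bin midpoints) instead of string labels, so the tick labels stay readable. If SciPy is installed, `df.plot.kde()` should give smoothed curves for all columns on one axes, but that is a kernel density estimate rather than the binning described above.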