Question

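I am using the following code to create histograms of three different variables. I want to separate the three bars at each data point for better visualization. I tried adding the "position" parameter to each plot call, but it doesn't work.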

count, bin_edges = np.histogram(df['age'])

fig = plt.figure(figsize=(7,3))
ax = fig.add_subplot(111) # Create matplotlib axes

df['age'].plot(kind = 'hist', figsize=(10,5), xticks = bin_edges, 
               width = 2, color = 'blue', alpha=0.4)

df[df['y'] == 1]['age'].plot(kind = 'hist', figsize=(10,5), xticks = bin_edges, 
               width = 2, color='red', alpha=0.4)

df[(df['y'] == 1)&(df['new_customer'] == 1)]['age'].plot(kind = 'hist', figsize=(10,5), xticks = bin_edges, 
               width = 2, color='green', alpha=0.4)

plt.title("Age")
plt.xlabel("Age Bins")
plt.ylabel("Number of Contacts")
plt.legend(loc='upper right')
plt.show()

Edit: this is what my df looks like:

df[['age', 'y', 'new_customer']]


   age  y   new_customer
0   56  0   1
1   57  0   1
2   37  0   1
3   40  0   1
4   56  0   1
5   45  0   1
6   59  0   1
7   41  0   1
8   24  0   1
9   25  0   1
10  41  0   1
11  25  0   1
12  29  0   1

Answer 1

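The pandas plotting API is not nearly as flexible as the underlying Matplotlib library it uses to draw the actual plots. Just use Matplotlib directly; when plt.hist is given a list of datasets, it draws their bars side by side within each bin: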

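from io import StringIO  # pd.compat.StringIO was removed in newer pandas releases

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

csv = '''   age  y   new_customer
0   56  0   1
1   57  1   1
2   37  0   1
3   40  0   1
4   56  1   1
5   45  0   0
6   59  0   1
7   41  1   1
8   24  0   0
9   25  0   1
10  41  1   1
11  25  0   0
12  29  0   1'''

# Raw string so that '\s' is passed to the regex engine unchanged
df = pd.read_csv(StringIO(csv), sep=r'\s+')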

bin_edges = np.histogram_bin_edges(df['age'])

fig = plt.figure(figsize=(7,3))
ax = fig.add_subplot(111) # Create matplotlib axes

data = [df['age'], 
        df[df['y'] == 1]['age'],
        df[(df['y'] == 1)&(df['new_customer'] == 1)]['age']]
plt.hist(data, bins=bin_edges, label=['age', 'age_y', 'age_y_newcustomer'])

bin_cens = (bin_edges[:-1] + bin_edges[1:])/2
plt.xticks(bin_cens)

plt.title("Age")
plt.xlabel("Age Bins (center)")
plt.ylabel("Number of Contacts")
plt.legend()
plt.show()
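If you want explicit control over where each bar sits (which is what the "position" attempt in the question was after), one option is to compute the counts yourself with np.histogram and place the bars with ax.bar. The following is only a rough sketch under a few assumptions: the helper name grouped_hist and the 0.8 fill fraction are made up for illustration, and the arrays simply repeat the sample rows used above.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line to use an interactive one
import matplotlib.pyplot as plt
import numpy as np


def grouped_hist(datasets, labels, bin_edges, ax, fill=0.8):
    """Draw one bar per dataset inside every bin, side by side.

    `grouped_hist` and `fill` are hypothetical names for this sketch.
    `fill` is the fraction of each bin's width covered by bars.
    """
    n = len(datasets)
    bin_width = np.diff(bin_edges)
    bar_width = fill * bin_width / n
    margin = (1.0 - fill) / 2.0 * bin_width  # empty space on each side of a bin
    all_counts = []
    for i, (values, label) in enumerate(zip(datasets, labels)):
        counts, _ = np.histogram(values, bins=bin_edges)
        # Left edge of the i-th bar within each bin
        left = bin_edges[:-1] + margin + i * bar_width
        ax.bar(left, counts, width=bar_width, align="edge", label=label)
        all_counts.append(counts)
    return np.array(all_counts)


# Same sample rows as in the answer above
age = np.array([56, 57, 37, 40, 56, 45, 59, 41, 24, 25, 41, 25, 29])
y = np.array([0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0])
new_customer = np.array([1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1])

data = [age, age[y == 1], age[(y == 1) & (new_customer == 1)]]
labels = ["age", "age_y", "age_y_newcustomer"]
bin_edges = np.histogram_bin_edges(age)

fig, ax = plt.subplots(figsize=(7, 3))
counts = grouped_hist(data, labels, bin_edges, ax)
ax.set_xticks(bin_edges)
ax.set_xlabel("Age Bins")
ax.set_ylabel("Number of Contacts")
ax.legend()
```

To see the result, call fig.savefig(...) or, with an interactive backend, plt.show(). For most cases the plt.hist call above is simpler; the manual version is mainly useful if you need custom gaps or bar widths.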

Output:

[Image: How do I separate the three histogram bars? The position parameter fails]
