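I am using the code below to create histograms of three different variables. I want the three bars at each data point to be drawn separately for better visualization. I tried adding the `position` argument for each variable, but it did not work.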
count, bin_edges = np.histogram(df['age'])
fig = plt.figure(figsize=(7,3))
ax = fig.add_subplot(111) # Create matplotlib axes
df['age'].plot(kind = 'hist', figsize=(10,5), xticks = bin_edges,
width = 2, color = 'blue', alpha=0.4)
df[df['y'] == 1]['age'].plot(kind = 'hist', figsize=(10,5), xticks = bin_edges,
width = 2, color='red', alpha=0.4)
df[(df['y'] == 1)&(df['new_customer'] == 1)]['age'].plot(kind = 'hist', figsize=(10,5), xticks = bin_edges,
width = 2, color='green', alpha=0.4)
plt.title("Age")
plt.xlabel("Age Bins")
plt.ylabel("Number of Contacts")
plt.legend(loc='upper right')
plt.show()
Edit: this is what my df looks like:
df[['age', 'y', 'new_customer']]
age y new_customer
0 56 0 1
1 57 0 1
2 37 0 1
3 40 0 1
4 56 0 1
5 45 0 1
6 59 0 1
7 41 0 1
8 24 0 1
9 25 0 1
10 41 0 1
11 25 0 1
12 29 0 1
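Answer 0 (score: 2)
The pandas plotting API is nowhere near as flexible as Matplotlib, the underlying library it uses to draw the actual plots. Just use Matplotlib directly: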
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from io import StringIO  # pd.compat.StringIO is no longer available in recent pandas
csv = ''' age y new_customer
0 56 0 1
1 57 1 1
2 37 0 1
3 40 0 1
4 56 1 1
5 45 0 0
6 59 0 1
7 41 1 1
8 24 0 0
9 25 0 1
10 41 1 1
11 25 0 0
12 29 0 1'''
df = pd.read_csv(StringIO(csv), sep=r'\s+')
bin_edges = np.histogram_bin_edges(df['age'])
fig = plt.figure(figsize=(7,3))
ax = fig.add_subplot(111) # Create matplotlib axes
data = [df['age'],
df[df['y'] == 1]['age'],
df[(df['y'] == 1)&(df['new_customer'] == 1)]['age']]
plt.hist(data, bins=bin_edges, label=['age', 'age_y', 'age_y_newcustomer'])
bin_cens = (bin_edges[:-1] + bin_edges[1:])/2
plt.xticks(bin_cens)
plt.title("Age")
plt.xlabel("Age Bins (center)")
plt.ylabel("Number of Contacts")
plt.legend()
plt.show()
Output:
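As a complement to the answer above: passing a list of arrays to `plt.hist` draws the datasets as side-by-side bars within each bin. If finer control over bar width or spacing is wanted, roughly the same layout can be built by hand from `np.histogram` counts and `ax.bar`. The following is only a sketch under assumptions: the helper name `grouped_hist_counts` and the 0.8 fill fraction are illustrative choices, not part of the original answer, and the arrays simply restate the answer's sample data.

```python
import numpy as np
import matplotlib.pyplot as plt


def grouped_hist_counts(datasets, bin_edges):
    """Return one row of per-bin counts per dataset (hypothetical helper)."""
    return np.array([np.histogram(d, bins=bin_edges)[0] for d in datasets])


# Sample values restated from the answer's CSV
ages = np.array([56, 57, 37, 40, 56, 45, 59, 41, 24, 25, 41, 25, 29])
y = np.array([0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0])
new_customer = np.array([1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1])

datasets = [ages, ages[y == 1], ages[(y == 1) & (new_customer == 1)]]
labels = ['age', 'age_y', 'age_y_newcustomer']

# Shared bin edges so all three groups are counted over the same bins
bin_edges = np.histogram_bin_edges(ages)
counts = grouped_hist_counts(datasets, bin_edges)

bin_centers = (bin_edges[:-1] + bin_edges[1:]) / 2
# Each group gets an equal slice of 80% of the bin width (assumed fill fraction)
bar_width = 0.8 * np.diff(bin_edges) / len(datasets)

fig, ax = plt.subplots(figsize=(10, 5))
for i, (row, label) in enumerate(zip(counts, labels)):
    # Shift each group left or right of the bin center so the bars do not overlap
    offset = (i - (len(datasets) - 1) / 2) * bar_width
    ax.bar(bin_centers + offset, row, width=bar_width, label=label)

ax.set_xticks(bin_centers)
ax.set_title("Age")
ax.set_xlabel("Age Bins (center)")
ax.set_ylabel("Number of Contacts")
ax.legend()
plt.show()
```

For most cases the single `plt.hist(data, bins=bin_edges, ...)` call in the answer is simpler; the manual version is mainly useful when the gap between groups or the bar width needs to be tuned.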