I have a pandas DataFrame df with a column 'Community School?', and I create a color-code column for matplotlib from it like this:
df['color-code'] = np.where(df['Community School?']=='Yes', 'blue', 'red')
I also created a separate DataFrame without null values to use for plotting:
sc_income = df[~df['Economic Need Index'].isnull() & ~df['School Income Estimate'].isnull()]
Then I plot it with:
#make plot bigger
plt.rcParams['figure.figsize'] = (40,20)
#plot Economic Need Index vs School Income Estimate
plt.scatter(sc_income['Economic Need Index'], sc_income['School Income Estimate'], c=sc_income['color-code'])
plt.xlabel('Economic Need')
plt.ylabel('School Income $')
plt.title('Economic Need vs. School Income')
plt.legend()
plt.show()
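The final plot looks like this: [plot image]
The legend I want should indicate that blue means a community school and red means not a community school.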
Answer 0 (score: 1)
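You are trying to color points by group. There are many ways to do this. With matplotlib, you can plot each group in its own scatter call and pass a label, so that legend() has an entry for each group: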
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# generate data
n_obs = 100
df = pd.DataFrame({'Community School?': np.random.choice(['Yes', 'No'], size=n_obs),
                   'Economic Need Index': np.random.uniform(size=n_obs),
                   'School Income Estimate': np.random.normal(loc=n_obs, size=n_obs)})
# your data pre-processing steps
df['color-code'] = np.where(df['Community School?']=='Yes', 'blue', 'red')
sc_income = df[~df['Economic Need Index'].isnull() & ~df['School Income Estimate'].isnull()]
# plot Economic Need Index vs School Income Estimate by group
groups = sc_income.groupby('Community School?')
fig, ax = plt.subplots(1, figsize=(40,20))
for label, group in groups:
    # one scatter call per group, so each group gets its own legend entry
    ax.scatter(group['Economic Need Index'], group['School Income Estimate'],
               c=group['color-code'], label=label)
ax.set(xlabel='Economic Need', ylabel='School Income $',
       title='Economic Need vs. School Income')
ax.legend(title='Community School?')
plt.show()
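If you would rather keep a single scatter call, as in the question, another option is to build the legend entries by hand with proxy artists. The following is only a minimal sketch under some assumptions: the data is synthetic, the names `colors` and `legend_labels` are made up for illustration, and `dropna(subset=...)` is used as a shorter way to write the same null filter.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless; drop for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib.lines import Line2D

# Synthetic stand-in for the question's data (values are made up).
rng = np.random.default_rng(0)
n_obs = 100
df = pd.DataFrame({
    'Community School?': rng.choice(['Yes', 'No'], size=n_obs),
    'Economic Need Index': rng.uniform(size=n_obs),
    'School Income Estimate': rng.normal(loc=n_obs, size=n_obs),
})

# Hypothetical helper mappings: group value -> color, color -> legend text.
colors = {'Yes': 'blue', 'No': 'red'}
legend_labels = {'blue': 'Community school', 'red': 'Not a community school'}

# Keep only rows where both plotted columns are present.
sc_income = df.dropna(subset=['Economic Need Index', 'School Income Estimate'])

fig, ax = plt.subplots(figsize=(10, 5))
# A single scatter call; the per-point colors come from the mapping.
ax.scatter(sc_income['Economic Need Index'], sc_income['School Income Estimate'],
           c=sc_income['Community School?'].map(colors).tolist())

# Proxy artists: empty lines that exist only to carry a marker, color and label.
handles = [Line2D([], [], marker='o', linestyle='', color=color, label=text)
           for color, text in legend_labels.items()]
ax.legend(handles=handles, title='Community School?')
ax.set(xlabel='Economic Need', ylabel='School Income $',
       title='Economic Need vs. School Income')
```

The legend here does not depend on how the points were drawn, so it also works when the colors come from a precomputed column such as 'color-code'. Call plt.show() or fig.savefig(...) afterwards to see the result.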
Alternatively, use seaborn's pairplot, for example:
import seaborn as sns
# note: `height` replaced the older `size` argument in seaborn 0.9
g = sns.pairplot(x_vars='Economic Need Index', y_vars='School Income Estimate', data=sc_income,
                 hue="Community School?", height=5)
g.set(xlabel='Economic Need', ylabel='School Income $', title='Economic Need vs. School Income')
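Since this is a single x/y pair, seaborn's scatterplot may be a more direct fit than pairplot, and its palette argument lets you keep the blue/red coding from the question. This is a rough sketch rather than a definitive recipe: it assumes seaborn 0.9 or later (where scatterplot was introduced) and uses synthetic data in place of the real DataFrame.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless; drop for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in for the question's data (values are made up).
rng = np.random.default_rng(0)
n_obs = 100
df = pd.DataFrame({
    'Community School?': rng.choice(['Yes', 'No'], size=n_obs),
    'Economic Need Index': rng.uniform(size=n_obs),
    'School Income Estimate': rng.normal(loc=n_obs, size=n_obs),
})
sc_income = df.dropna(subset=['Economic Need Index', 'School Income Estimate'])

fig, ax = plt.subplots(figsize=(10, 5))
# hue splits the points by group; palette pins each group to a fixed color.
sns.scatterplot(data=sc_income, x='Economic Need Index', y='School Income Estimate',
                hue='Community School?', palette={'Yes': 'blue', 'No': 'red'}, ax=ax)
ax.set(xlabel='Economic Need', ylabel='School Income $',
       title='Economic Need vs. School Income')
```

seaborn adds the legend automatically from the hue column, so no separate legend() call is needed here.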