Adding a legend to a scatter plot to distinguish colours?

Time: 2016-05-05 11:49:07

Tags: python pandas matplotlib

I'm using Pandas 0.18. I have a DataFrame like this:

code    proportion    percent_highcost    total_quantity
A81     0.7           76                  1002
A81     0.0           73                  1400

I'm drawing a scatter plot like this:

colours = np.where(df['proportion'] > 0, 'r', 'b')  
df.plot.scatter(y='percent_highcost', x='total_quantity', c=colours)

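This works well, but I can't figure out how to add a legend that indicates what the two colours mean.

I tried plt.legend(['Non-dispensing', 'dispensing'], loc=1), but that produces an odd result, I assume because there is only one series in the plot:

[image]

Can anyone advise?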

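1 answer:

Answer 0 (score: 0)

Plotting separate DataFrames on the same axes

Plotting multiple series (not pandas Series) in a scatter plot can be done by splitting the DataFrame by a condition and then plotting the pieces as separate scatters, each with its own colour, on the same axes. This is shown in another answer. I'll reproduce it here with your data.

Note: this was done in an IPython / Jupyter notebook.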

%matplotlib inline

import pandas as pd
from io import StringIO  # Python 3 replacement for cStringIO

# example data
text = '''
code    proportion    percent_highcost    total_quantity
A81     0.7           76                  1002
A81     0.0           73                  1400
A81     0.1           77                  1300
A81     0.0           74                  1200
A81     -0.1          78                  1350
'''

# read in example data (columns are separated by runs of whitespace)
df = pd.read_csv(StringIO(text), sep=r'\s+')

print('Original DataFrame:')
print(df)
print()

# split the DataFrame into two DataFrames
condition = df['proportion'] > 0
df1 = df[condition].dropna()
df2 = df[~condition].dropna()

print('DataFrame 1:')
print(df1)
print()

print('DataFrame 2:')
print(df2)
print()

# Plot 2 DataFrames on one axis; each call's label becomes a legend entry.
# Note: the question used 'r' for proportion > 0; swap the c= values to match it.
ax = df1.plot(kind='scatter', x='total_quantity', y='percent_highcost', c='b', s=100, label='Non-Dispensing')
df2.plot(kind='scatter', x='total_quantity', y='percent_highcost', c='r', s=100, label='Dispensing', ax=ax)

Output:

Original DataFrame:
  code  proportion  percent_highcost  total_quantity
0  A81         0.7                76            1002
1  A81         0.0                73            1400
2  A81         0.1                77            1300
3  A81         0.0                74            1200
4  A81        -0.1                78            1350

DataFrame 1:
  code  proportion  percent_highcost  total_quantity
0  A81         0.7                76            1002
2  A81         0.1                77            1300

DataFrame 2:
  code  proportion  percent_highcost  total_quantity
1  A81         0.0                73            1400
3  A81         0.0                74            1200
4  A81        -0.1                78            1350

[image: Two Series Scatter Plot]
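
If you would rather keep the single scatter call with the colours array from the question, one possible alternative is to build the legend from "proxy artists": handles that exist only for the legend and are not drawn on the plot. The following is a minimal sketch, not a definitive solution. It assumes plain matplotlib axes rather than df.plot.scatter, and the label texts are placeholders, since the thread does not say which group is "dispensing". The Agg backend line is only there so the snippet runs without a display.

```python
import matplotlib
matplotlib.use('Agg')  # headless backend; drop this line in a notebook
import matplotlib.pyplot as plt
from matplotlib.lines import Line2D
import numpy as np
import pandas as pd

# Same example data as in the answer above
df = pd.DataFrame({
    'proportion': [0.7, 0.0, 0.1, 0.0, -0.1],
    'percent_highcost': [76, 73, 77, 74, 78],
    'total_quantity': [1002, 1400, 1300, 1200, 1350],
})

# Same colour rule as in the question: red where proportion > 0, blue otherwise
colours = np.where(df['proportion'] > 0, 'r', 'b')

# One scatter call, so the axes hold a single collection of points
fig, ax = plt.subplots()
ax.scatter(df['total_quantity'], df['percent_highcost'], c=colours, s=100)
ax.set_xlabel('total_quantity')
ax.set_ylabel('percent_highcost')

# Proxy artists: empty Line2D objects that only supply a marker and a label.
# The label texts are placeholders; replace them with what the groups mean.
handles = [
    Line2D([], [], marker='o', linestyle='None', color='r', label='proportion > 0'),
    Line2D([], [], marker='o', linestyle='None', color='b', label='proportion <= 0'),
]
legend = ax.legend(handles=handles, loc=1)

# Collect the legend texts so the result can be checked without viewing the plot
legend_labels = [t.get_text() for t in legend.get_texts()]
```

Compared with splitting the DataFrame, this keeps the data in one piece, but the colour rule and the legend handles have to be kept in sync by hand.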