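How do I add a single legend label to a set of boxplots?

Asked: 2018-01-30 10:36:55

Tags: python matplotlib plot seaborn boxplot

Is there a better way to add a single label to the legend for a set of boxplots?

Below is a simple working example that produces the desired result. It does so by creating an invisible line (alpha=0) with the desired label and then changing the alpha through legendHandles. However, can a single label for all of the boxplots be passed to sns.boxplot() instead?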

import seaborn as sns
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# Get the tips dataset and select a subset as an example
tips = sns.load_dataset("tips")
variable_to_bin_by = 'tip'
binned_variable = 'total_bill'
df = tips[ [binned_variable,  variable_to_bin_by] ]  

# Group the data by a list of bins
bins = np.array([0, 1, 2, 3, 4])
gdf = df.groupby( pd.cut(df[variable_to_bin_by].values, bins ) )
data = [ i[1][binned_variable].values for i in gdf]
df = pd.DataFrame( data, index = bins[:-1])   

# Plot the data (using boxplots to show spread of real values)
fig, ax = plt.subplots()
ax = sns.boxplot( data=df.T, ax=ax, color='k')

# Create hidden line with the extra label (to give label to boxplots)
x = np.arange(10)
plt.plot(x, x, label='REAL DATA', color='k', alpha=0)

# Now plot some "model fit" lines
models = {'model1': bins+10, 'model2': bins+10*1.5, 'model3': bins*10}
for key in sorted( models.keys() ):
    plt.plot( bins, models[key], label=key )

# Add a legend
leg = plt.legend()

# Update line visibility (alpha)
for legobj in leg.legendHandles:  # renamed to leg.legend_handles in Matplotlib 3.7+
    legobj.set_alpha( 1 )

# Show the plot
plt.show()

While this gives the desired result (shown below), my question is whether there is a better way.

[Image: resulting plot, captioned "Success!"]

1 answer:

Answer 0 (score: 1)

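Instead of using a line that contains some data (which then has to be made invisible in the plot and visible again in the legend), you can directly create an empty line with the properties you want shown in the legend (here, the color).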

plt.plot([], [], label='REAL DATA', color='k')
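
To check the empty-line behaviour in isolation, here is a minimal sketch that uses plain Matplotlib and made-up stand-in data rather than the tips dataset. The Agg backend and the variable names are assumptions for illustration only; the point is that the empty line contributes a legend entry without adding anything to the data shown on the axes.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no display is needed
import matplotlib.pyplot as plt

fig, ax = plt.subplots()

# Stand-in for content that is already on the axes (hypothetical data)
ax.plot([0, 1, 2], [0, 1, 4], label="model1")
xlim_before = ax.get_xlim()

# Empty line: nothing is drawn, but it carries a label and a color for the legend
(empty_line,) = ax.plot([], [], label="REAL DATA", color="k")
xlim_after = ax.get_xlim()

leg = ax.legend()
legend_labels = [text.get_text() for text in leg.get_texts()]
print(legend_labels)
```

Because the line holds no points, there is nothing to hide with alpha, and the legend handle is drawn fully opaque from the start.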

Using an empty line avoids having to adjust alpha in both the plot and the legend. A complete example follows:

import seaborn as sns
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# Get the tips dataset and select a subset as an example
tips = sns.load_dataset("tips")
variable_to_bin_by = 'tip'
binned_variable = 'total_bill'
df = tips[ [binned_variable,  variable_to_bin_by] ]  

# Group the data by a list of bins
bins = np.array([0, 1, 2, 3, 4])
gdf = df.groupby( pd.cut(df[variable_to_bin_by].values, bins ) )
data = [ i[1][binned_variable].values for i in gdf]
df = pd.DataFrame( data, index = bins[:-1])   

# Plot the data (using boxplots to show spread of real values)
fig, ax = plt.subplots()
ax = sns.boxplot( data=df.T, ax=ax, color="grey")

# Create an empty line carrying the extra label (to give the boxplots a legend entry)
plt.plot([], [], label='REAL DATA', color='k')

# Now plot some "model fit" lines
models = {'model1': bins+10, 'model2': bins+10*1.5, 'model3': bins*10}
for key in sorted( models.keys() ):
    plt.plot( bins, models[key], label=key, zorder=3)

# Add a legend
leg = plt.legend()

# Show the plot
plt.show()
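
A related option, which is not part of the answer above, is to build the legend entry from a proxy artist and pass it to the legend explicitly. The sketch below assumes plain Matplotlib's `ax.boxplot` with randomly generated placeholder data instead of `sns.boxplot` and the tips dataset, so the values and names (`samples`, `box_proxy`) are hypothetical; the same `handles=` pattern should carry over to axes drawn by seaborn.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no display is needed
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.patches import Patch

# Placeholder data standing in for the binned total_bill values
rng = np.random.default_rng(0)
samples = [rng.normal(loc=mean, size=50) for mean in (10, 15, 20, 25)]

fig, ax = plt.subplots()
ax.boxplot(samples, positions=[0, 1, 2, 3])

# One "model fit" line, drawn above the boxes
x = np.arange(4)
ax.plot(x, x * 10, label="model1", zorder=3)

# Proxy artist: never added to the axes, used only as a legend handle
box_proxy = Patch(facecolor="grey", edgecolor="k", label="REAL DATA")

# Combine the proxy with the handles Matplotlib collected from labelled artists
handles, labels = ax.get_legend_handles_labels()
leg = ax.legend(handles=[box_proxy] + handles)
legend_labels = [text.get_text() for text in leg.get_texts()]
print(legend_labels)
```

Compared with the empty line, a patch handle shows up in the legend as a filled box rather than a line, which may match the look of the boxplots more closely.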
