Seaborn heatmap colorbar: how to ensure the correct class order and the correct colors

Time: 2021-01-02 09:29:54

Tags: seaborn heatmap colorbar

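I have a dataframe containing the results of a calculation, and I want to plot it as a seaborn heatmap with a colorbar. I am using the following code to do this (largely adapted from an existing example):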

import pandas as pd
import seaborn as sns
from matplotlib import pyplot as plt

# input data
results = [['equal','equal','smaller','smaller or equal','greater or equal'],   
           ['equal','equal','smaller','smaller','greater or equal'],                                      
           ['greater','equal','smaller or equal','smaller','smaller'],
           ['equal','smaller or equal','greater or equal','greater or equal','equal'],
           ['equal','equal','smaller','equal','equal']]

index = ['axc', 'org', 'cf5', 'cm1', 'ext']
columns = ['axc', 'org', 'cf5', 'cm1', 'ext']

# create a dataframe
res_df = pd.DataFrame(results, index=index, columns=columns)

value_to_int = {j:i for i,j in enumerate(['greater','greater or equal','equal','smaller or equal','smaller'])}

n = len(value_to_int)     

# discrete colormap (n samples from a given cmap)
cmap = sns.color_palette("viridis", n) 
ax = sns.heatmap(res_df.replace(value_to_int), cmap=cmap) 

# modify colorbar:
colorbar = ax.collections[0].colorbar 
r = colorbar.vmax - colorbar.vmin 
colorbar.set_ticks([colorbar.vmin + r / n * (0.5 + i) for i in range(n)])
colorbar.set_ticklabels(list(value_to_int.keys()))                                          
plt.show()

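This works like a charm in most cases, but a problem arises when one of the classes in the category list is absent from the data. To demonstrate, change the dataframe like this: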

results_changed = [['equal','equal','smaller','smaller or equal','greater or equal'],
              ['equal','equal','smaller','smaller','greater or equal'],
              ['greater or equal','equal','smaller or equal','smaller','smaller'],
              ['equal','smaller or equal','greater or equal','greater or equal','equal'],
              ['equal','equal','smaller','equal','equal']]

index = ['axc', 'org', 'cf5', 'cm1', 'ext']
columns = ['axc', 'org', 'cf5', 'cm1', 'ext']

# create a dataframe
res_df = pd.DataFrame(results_changed, index=index, columns=columns)

value_to_int = {j:i for i,j in enumerate(['greater','greater or equal','equal','smaller or equal','smaller'])}

n = len(value_to_int)  

# discrete colormap (n samples from a given cmap)
cmap = sns.color_palette("viridis", n) 
ax = sns.heatmap(res_df.replace(value_to_int), cmap=cmap) 

# modify colorbar:
colorbar = ax.collections[0].colorbar 
r = colorbar.vmax - colorbar.vmin 
colorbar.set_ticks([colorbar.vmin + r / n * (0.5 + i) for i in range(n)])
colorbar.set_ticklabels(list(value_to_int.keys()))                                          
plt.show()  

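Plotting this produces a heatmap that assigns the wrong colors to the classes: because there is no longer any 'greater' case, the palette is "shifted", and 'equal' is no longer drawn in the same color as before.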
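Why the palette shifts can be seen with a small model of the color lookup. The sketch below is a simplification, not seaborn's actual code: it assumes that, when no `vmin`/`vmax` are given, the limits are taken from the minimum and maximum of the data, values are normalized linearly, and each value falls into one of `n` equally sized color bins. The helper name `color_bin` is made up for illustration.

```python
def color_bin(value, vmin, vmax, n):
    """Return the index of the color bin (0..n-1) a value falls into.

    Simplified model: linear normalization to [0, 1], then truncation
    into n equal bins, with the upper limit assigned to the last bin.
    """
    x = (value - vmin) / (vmax - vmin)
    return min(int(x * n), n - 1)


category_order = ['greater', 'greater or equal', 'equal', 'smaller or equal', 'smaller']
n = len(category_order)

# 'greater' (code 0) is absent, so the data only spans the codes 1..4.
present_codes = [1, 2, 3, 4]

# Limits inferred from the data: codes 1 and 2 slide down into bins 0 and 1.
inferred = {category_order[c]: color_bin(c, min(present_codes), max(present_codes), n)
            for c in present_codes}

# Fixed limits 0..n: every code lands in the bin with its own index.
fixed = {category_order[c]: color_bin(c, 0, n, n) for c in present_codes}

print(inferred)
print(fixed)
```

Under this model, 'equal' (code 2) ends up in bin 1 when the limits are inferred from the data, but stays in bin 2 when the limits are fixed to the full range of codes.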

I tried to solve this by changing this line in the code:

value_to_int = {j:i for i,j in enumerate(pd.unique(res_df.values.ravel()))}

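While this fixes the color assignment, it creates another problem: the colorbar no longer lists the classes in the intended order, which I want to avoid.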
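The reordering follows from how this mapping is built: `pd.unique` returns values in order of first appearance, so the integer codes, and with them the colorbar order, depend on where each label first occurs in the data. The sketch below reproduces that ordering with plain Python (`dict.fromkeys` also keeps first-appearance order). It only illustrates the effect and is not a replacement for the line above.

```python
results_changed = [['equal', 'equal', 'smaller', 'smaller or equal', 'greater or equal'],
                   ['equal', 'equal', 'smaller', 'smaller', 'greater or equal'],
                   ['greater or equal', 'equal', 'smaller or equal', 'smaller', 'smaller'],
                   ['equal', 'smaller or equal', 'greater or equal', 'greater or equal', 'equal'],
                   ['equal', 'equal', 'smaller', 'equal', 'equal']]

# Flatten row by row, the same order that res_df.values.ravel() produces.
flat = [label for row in results_changed for label in row]

# Keep each label once, in order of first appearance.
appearance_order = list(dict.fromkeys(flat))
value_to_int = {label: i for i, label in enumerate(appearance_order)}

print(appearance_order)
print(value_to_int)
```

Here 'equal' receives code 0 and 'smaller' code 1 simply because they occur first in the top row, which is why the colorbar no longer runs from 'greater' to 'smaller'.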

Can anyone suggest how to fix this? Any suggestions would be much appreciated.

1 Answer:

Answer 0: (score: 1)

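The best way to keep plots comparable across different inputs is to always pin the colorbar to the same limits (here `vmin=0` and `vmax=n`), regardless of which categories actually occur in the data: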

import pandas as pd
from matplotlib import pyplot as plt
import seaborn as sns

results_changed = [['equal','equal','smaller','smaller or equal','greater or equal'],
              ['equal','equal','smaller','smaller','greater or equal'],
              ['greater or equal','equal','smaller or equal','smaller','smaller'],
              ['equal','smaller or equal','greater or equal','greater or equal','equal'],
              ['equal','equal','smaller','equal','equal']]

index = ['axc', 'org', 'cf5', 'cm1', 'ext']
columns = ['axc', 'org', 'cf5', 'cm1', 'ext']

# create a dataframe
res_df = pd.DataFrame(results_changed, index=index, columns=columns)

#construct dictionary from ordered list
category_order = ['greater', 'greater or equal', 'equal', 'smaller or equal', 'smaller']    
value_to_int = {j:i for i,j in enumerate(category_order)}    
n = len(value_to_int)  

# discrete colormap (n samples from a given cmap)
cmap = sns.color_palette("viridis", n) 
ax = sns.heatmap(res_df.replace(value_to_int), cmap=cmap, vmin=0, vmax=n) 

#modify colorbar:
colorbar = ax.collections[0].colorbar 
colorbar.set_ticks([0.5 + i for i in range(n)])
colorbar.set_ticklabels(category_order)                                          
plt.show()  

Sample output: [image]
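As a variation on the mapping step, the labels can be encoded column by column with `Series.map`, which also makes it easy to detect labels that are missing from `category_order`. This is a sketch under assumptions: the helper name `encode_categories` is hypothetical, the small dataframe is made-up sample data, and raising on unknown labels is just one possible choice.

```python
import pandas as pd

category_order = ['greater', 'greater or equal', 'equal', 'smaller or equal', 'smaller']


def encode_categories(df, order):
    """Map each label in df to its position in `order` (fixed integer codes)."""
    value_to_int = {label: i for i, label in enumerate(order)}
    codes = df.apply(lambda col: col.map(value_to_int))
    # Labels that are not in `order` come back as NaN after map().
    if codes.isna().any().any():
        unknown = sorted(set(df.to_numpy().ravel()) - set(order))
        raise ValueError(f"labels not in category order: {unknown}")
    return codes.astype(int)


# Made-up sample data, smaller than the dataframe used above.
sample_df = pd.DataFrame(
    [['equal', 'smaller', 'greater or equal'],
     ['smaller or equal', 'equal', 'smaller']],
    index=['axc', 'org'],
    columns=['axc', 'org', 'cf5'],
)

codes = encode_categories(sample_df, category_order)
print(codes)
```

The resulting `codes` frame could then be passed to `sns.heatmap(codes, cmap=cmap, vmin=0, vmax=len(category_order))` in the same way as `res_df.replace(value_to_int)` above.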

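If you only want the colorbar to show the categories that are actually present, you can pre-filter the category list down to those that occur in the data. Note that this changes the color scheme from one input array to the next, which makes the plots hard to compare.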

import pandas as pd
from matplotlib import pyplot as plt
import seaborn as sns
import numpy as np

results_changed = [['equal','equal','smaller','smaller or equal','greater'],
              ['equal','equal','smaller','smaller','greater'],
              ['greater','equal','smaller','smaller','smaller'],
              ['equal','smaller','greater','greater','equal'],
              ['equal','equal','smaller','equal','equal']]

index = ['axc', 'org', 'cf5', 'cm1', 'ext']
columns = ['axc', 'org', 'cf5', 'cm1', 'ext']

# create a dataframe
res_df = pd.DataFrame(results_changed, index=index, columns=columns)

unique_results = np.unique(results_changed)
unique_categories = [cat for cat in ['greater','greater or equal','equal','smaller or equal','smaller'] if cat in unique_results]

value_to_int = {j:i for i,j in enumerate(unique_categories)}

n = len(value_to_int)  

# discrete colormap (n samples from a given cmap)
cmap = sns.color_palette("viridis", n) 
ax = sns.heatmap(res_df.replace(value_to_int), cmap=cmap) 

#modify colorbar:
colorbar = ax.collections[0].colorbar 
r = colorbar.vmax - colorbar.vmin 
colorbar.set_ticks([colorbar.vmin + r / n * (0.5 + i) for i in range(n)])
colorbar.set_ticklabels(unique_categories)
plt.show()  

Sample output: [image]
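One possible middle ground, sketched here as an assumption rather than as part of the answer above: keep one fixed color per category from the full list, and pass only the colors of the categories that are present. The colorbar then shows only existing categories, while each category keeps its color across inputs. The color names are placeholders standing in for `sns.color_palette("viridis", 5)`, and `select_present` is a hypothetical helper.

```python
category_order = ['greater', 'greater or equal', 'equal', 'smaller or equal', 'smaller']

# Placeholder colors; in the plots above these would come from
# sns.color_palette("viridis", len(category_order)).
full_palette = ['purple', 'blue', 'teal', 'green', 'yellow']


def select_present(data, order, palette):
    """Return the present categories (in fixed order) and their fixed colors."""
    present = {label for row in data for label in row}
    kept = [(cat, color) for cat, color in zip(order, palette) if cat in present]
    categories = [cat for cat, _ in kept]
    colors = [color for _, color in kept]
    return categories, colors


results_changed = [['equal', 'equal', 'smaller', 'smaller or equal', 'greater'],
                   ['equal', 'equal', 'smaller', 'smaller', 'greater'],
                   ['greater', 'equal', 'smaller', 'smaller', 'smaller'],
                   ['equal', 'smaller', 'greater', 'greater', 'equal'],
                   ['equal', 'equal', 'smaller', 'equal', 'equal']]

unique_categories, cmap_colors = select_present(results_changed, category_order, full_palette)
print(unique_categories)
print(cmap_colors)
```

In the code above, `cmap=cmap_colors` would take the place of the freshly sampled palette, while `unique_categories` would feed `value_to_int` and the tick labels as before.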
