Coloring Excel output in sklearn

Time: 2018-03-08 07:02:49

Tags: python pandas dataframe colors scikit-learn

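I want to conditionally format every row of my exported DataFrame (output format: Excel file) based on a single column (the max_probabilities column in my DataFrame). If the probability in max_probabilities is greater than 0.75, the entire row should be colored green; otherwise it should be colored red. How can I do this? (Note: I want to color the rows of the exported Excel file, not the DataFrame itself.) Code that builds the DataFrame: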

df=pd.DataFrame({'Details':x_test,'Amount':test_data.xn_Amount,'Category':Classified_Category,'Probability':max_probabilities})

This is what my exported DataFrame currently looks like:

[image]

Thanks

1 answer:

Answer 0 (score: 3)

Use conditional formats, but note that this only colors the column:

import string
import pandas as pd

df = pd.DataFrame({'Amount':[1,2,3],
                   'max_probabilities':[.1,2,.3]})
print(df)
#    Amount  max_probabilities
# 0       1                0.1
# 1       2                2.0
# 2       3                0.3
writer = pd.ExcelWriter('pandas_conditional.xlsx', engine='xlsxwriter')
df.to_excel(writer, sheet_name='Sheet1')
workbook  = writer.book
worksheet = writer.sheets['Sheet1']
red_format = workbook.add_format({'bg_color':'red'})
green_format = workbook.add_format({'bg_color':'green'})

# map zero-based DataFrame column positions to Excel column letters;
# column A holds the index, so the data columns start at B
d = dict(zip(range(25), list(string.ascii_uppercase)[1:]))
#print (d)

col = 'max_probabilities'
excel_header = str(d[df.columns.get_loc(col)])
# last Excel row: number of data rows plus one header row
len_df = str(len(df.index) + 1)
rng = excel_header + '2:' + excel_header + len_df
print(rng)
# C2:C4

worksheet.conditional_format(rng, {'type': 'cell',
                                      'criteria': '<',
                                       'value':     0.75,
                                       'format': red_format})

worksheet.conditional_format(rng, {'type': 'cell',
                                      'criteria': '>=',
                                       'value':   0.75,
                                       'format':  green_format})
# close() writes the file; ExcelWriter.save() was removed in pandas 2.0
writer.close()
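
The same conditional-format mechanism can also color whole rows if the rule is given as a formula instead of a per-cell comparison. The following is a minimal sketch, not part of the original answer: `excel_col_letter`, `row_rule` and `write_colored_rows` are hypothetical helper names, and it assumes the default `to_excel` layout (index in column A, header in row 1). The `$` in the formula pins the probability column while the row number stays relative, so each row is tested against its own value.

```python
import pandas as pd


def excel_col_letter(idx):
    """Convert a zero-based column position to Excel letters (0 -> A, 26 -> AA)."""
    letters = ''
    idx += 1
    while idx > 0:
        idx, rem = divmod(idx - 1, 26)
        letters = chr(ord('A') + rem) + letters
    return letters


def row_rule(df, col, threshold):
    """Return (cell range, green formula, red formula) keyed on column `col`.

    Assumes the index is written to column A and the header to row 1.
    """
    key = excel_col_letter(df.columns.get_loc(col) + 1)  # +1 skips the index column
    last_col = excel_col_letter(len(df.columns))         # letter of the last data column
    last_row = len(df) + 1                               # +1 for the header row
    rng = 'A2:{}{}'.format(last_col, last_row)
    green = '=${}2>{}'.format(key, threshold)
    red = '=${}2<={}'.format(key, threshold)
    return rng, green, red


def write_colored_rows(df, path, col, threshold=0.75):
    """Write df to `path` and color each row green or red based on `col`."""
    rng, green_formula, red_formula = row_rule(df, col, threshold)
    with pd.ExcelWriter(path, engine='xlsxwriter') as writer:
        df.to_excel(writer, sheet_name='Sheet1')
        workbook = writer.book
        worksheet = writer.sheets['Sheet1']
        green_format = workbook.add_format({'bg_color': 'green'})
        red_format = workbook.add_format({'bg_color': 'red'})
        worksheet.conditional_format(rng, {'type': 'formula',
                                           'criteria': green_formula,
                                           'format': green_format})
        worksheet.conditional_format(rng, {'type': 'formula',
                                           'criteria': red_formula,
                                           'format': red_format})
```

Calling `write_colored_rows(df, 'pandas_conditional_rows.xlsx', 'max_probabilities')` would write the file (the file name is only an example). Because these are conditional formats, Excel re-evaluates the colors if the probabilities are edited later.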

If you want to color entire rows:

df = pd.DataFrame({'Amount':[1,2,3],
                   'Category':['a','d','f'],
                   'max_probabilities':[.1,2,.3]})
print(df)
#    Amount Category  max_probabilities
# 0       1        a                0.1
# 1       2        d                2.0
# 2       3        f                0.3

def highlight(x):
    c1 = 'background-color: green'
    c2 = 'background-color: red' 
    # to leave non-matching rows unstyled, use an empty string instead:
    # c2 = ''
    m = x['max_probabilities'] > .75
    df1 = pd.DataFrame(c2, index=x.index, columns=x.columns)
    df1.loc[m, :] = c1
    return df1

df.style.apply(highlight, axis=None).to_excel('styled.xlsx', engine='openpyxl')
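
An alternative way to write the same styling is a row-wise function passed with `axis=1`, which receives one row at a time and returns one CSS string per cell. This is a sketch under the same assumptions as above (a `max_probabilities` column and a 0.75 threshold); `color_row` and `export_colored` are hypothetical names, not part of the original answer.

```python
import pandas as pd


def color_row(row, col='max_probabilities', threshold=0.75):
    """Return one CSS string per cell: green if row[col] > threshold, else red."""
    color = 'green' if row[col] > threshold else 'red'
    return ['background-color: {}'.format(color)] * len(row)


def export_colored(df, path):
    """Apply color_row to every row and write the styled frame with openpyxl."""
    df.style.apply(color_row, axis=1).to_excel(path, engine='openpyxl')
```

For example, `export_colored(df, 'styled_rows.xlsx')` would produce a file with static cell fills. Unlike conditional formats, these colors are fixed when the file is written and do not change if the values are edited in Excel.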