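I recently started modifying a Vega-Lite template to produce a confusion matrix for DVC, an open-source data science tool. You can see the template in my PR here, but I'll also repeat a simplified version below: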
{
    ...
    "data": {
        "values": [
            {"actual": "Wake", "predicted": "Wake", "rev": "HEAD"},
            {"actual": "Wake", "predicted": "Deep", "rev": "HEAD"},
            {"actual": "Light", "predicted": "Wake", "rev": "HEAD"},
            {"actual": "REM", "predicted": "Light", "rev": "HEAD"},
            ....
        ],
    },
    "spec": {
        "transform": [
            {
                "aggregate": [{"op": "count", "as": "xy_count"}],
                "groupby": ["actual", "predicted"],
            },
            {
                "joinaggregate": [
                    {"op": "max", "field": "xy_count", "as": "max_count"}
                ],
                "groupby": [],
            },
            {
                "calculate": "datum.xy_count / datum.max_count",
                "as": "percent_of_max",
            },
        ],
        "encoding": {
            "x": {"field": "predicted", "type": "nominal", "sort": "ascending"},
            "y": {"field": "actual", "type": "nominal", "sort": "ascending"},
        },
        "layer": [
            {
                "mark": "rect",
                "width": 300,
                "height": 300,
                "encoding": {
                    "color": {
                        "field": "xy_count",
                        "type": "quantitative",
                        "title": "",
                        "scale": {"domainMin": 0, "nice": True},
                    }
                },
            },
            {
                "mark": "text",
                "encoding": {
                    "text": {
                        "field": "xy_count",
                        "type": "quantitative"
                    },
                    "color": {
                        "condition": {
                            "test": "datum.xy_count / datum.max_count > 0.5",
                            "value": "white"
                        },
                        "value": "black"
                    }
                }
            }
        ]
    }
}
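Because I'm doing a groupby aggregation, the confusion matrix may contain cells with no entries. Here is an example output: link
How can I fill these cells with a "fallback" value or something similar? I also looked into pivot and impute, but couldn't quite figure it out. Help much appreciated :)
Answer 0 (score: 2)
You can do this by adding two Impute transforms at the end of the transform sequence: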
{"impute": "xy_count", "groupby": ["actual"], "key": "predicted", "keyvals": ["Deep", "Light", "Wake", "REM"], "value": 0},
{"impute": "xy_count", "groupby": ["predicted"], "key": "actual", "keyvals": ["Deep", "Light", "Wake", "REM"], "value": 0}
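keyvals specifies which missing values you want to impute along each axis; it can be omitted if every key value already appears in at least one group.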
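
To see why two passes are needed, here is a rough pure-Python sketch that mimics what the two impute transforms are meant to do. It is only a simplified emulation, not Vega-Lite's implementation: the helpers aggregate_counts and impute are hypothetical names made up for illustration, and the input is just the four sample rows shown in the question.

```python
from collections import Counter

# The four class labels used as keyvals in the answer.
LABELS = ["Deep", "Light", "Wake", "REM"]

# Only the four sample rows visible in the question; the elided rows are not guessed.
rows = [
    {"actual": "Wake", "predicted": "Wake", "rev": "HEAD"},
    {"actual": "Wake", "predicted": "Deep", "rev": "HEAD"},
    {"actual": "Light", "predicted": "Wake", "rev": "HEAD"},
    {"actual": "REM", "predicted": "Light", "rev": "HEAD"},
]


def aggregate_counts(rows):
    """Hypothetical helper: count rows per (actual, predicted) pair."""
    counts = Counter((r["actual"], r["predicted"]) for r in rows)
    return [
        {"actual": a, "predicted": p, "xy_count": n}
        for (a, p), n in counts.items()
    ]


def impute(records, field, key, groupby, keyvals, value):
    """Hypothetical helper: for every existing group, add a record with
    `field` set to `value` for each key value that the group lacks."""
    # Candidate keys: the explicit keyvals plus any key observed in the data.
    keys = list(dict.fromkeys(list(keyvals) + [r[key] for r in records]))
    groups = list(dict.fromkeys(r[groupby] for r in records))
    present = {(r[groupby], r[key]) for r in records}
    filled = list(records)
    for g in groups:
        for k in keys:
            if (g, k) not in present:
                filled.append({groupby: g, key: k, field: value})
    return filled


cells = aggregate_counts(rows)
# Pass 1: complete every row of the matrix that already has an "actual" label.
cells = impute(cells, "xy_count", key="predicted", groupby="actual",
               keyvals=LABELS, value=0)
# Pass 2: complete every column, which also creates rows for unseen "actual" labels.
cells = impute(cells, "xy_count", key="actual", groupby="predicted",
               keyvals=LABELS, value=0)

matrix = {(c["actual"], c["predicted"]): c["xy_count"] for c in cells}
```

In this sample no row has "Deep" as its actual label, so the first pass alone cannot create that row of the matrix; the second pass, grouped by predicted, fills it in.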