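I compute distances between words/sentences and run them through the scipy linkage function, but I need to know how to relate the result back to the original input. Because the linkage function does not accept labels, I lose them along the way.
TL;DR: I don't know how to associate my labels (var X) with the output of the linkage function.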
import numpy as np
from scipy.cluster.hierarchy import linkage

X = [
"the weather is good",
"it is a rainy day",
"it is raining today",
"This has something to do with today",
"This has something to do with tomorrow",
]
# my magic function
result_set = [['this has something to do with today', 'this has something to do with tomorrow', 0.95044514149501169],
['this has something to do with today', 'it is a rainy day', 0.27315656750393491],
['this has something to do with today', 'it is raining today', 0.21404567560988952],
['this has something to do with today', 'the weather is good', 0.12284646267479128],
['this has something to do with tomorrow', 'it is a rainy day', 0.28564020977046212],
['this has something to do with tomorrow', 'it is raining today', 0.19174771483161279],
['this has something to do with tomorrow', 'the weather is good', 0.12920110156248313],
['it is a rainy day', 'it is raining today', 0.54390124565447373],
['it is a rainy day', 'the weather is good', 0.20843820300588964],
['it is raining today', 'the weather is good', 0.19278767792873652]]
sims = np.array(result_set)[:, 2]
# sims ->
# ['0.950445141495' '0.273156567504' '0.21404567561' '0.122846462675'
#  '0.28564020977' '0.191747714832' '0.129201101562' '0.543901245654'
#  '0.208438203006' '0.192787677929']
Z = linkage(sims, 'ward')
# Z ->
# [[ 0.  4.  0.12284646  2. ]
#  [ 1.  3.  0.19174771  2. ]
#  [ 2.  5.  0.27143491  3. ]
#  [ 6.  7.  0.70328415  5. ]]
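Answer 0 (score: 2)
It turns out I was feeding similarities into a function that expects distances, so after inverting the sims the results do make sense. The following displays the labels correctly: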
dendrogram(
Z,
labels=X,
orientation="right",
leaf_rotation=0, # rotation angle of the leaf labels, in degrees
leaf_font_size=8, # font size of the leaf labels, in points
)
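For reference, here is a minimal sketch of the whole pipeline. It rests on several assumptions that are not in the original post: the similarities lie in [0, 1], so `1 - sim` is used as the "inversion"; the label order is read off the pair order in `result_set` (which looks like the reverse of `X`, lowercased) rather than taken from `X` directly; and `average` linkage is swapped in for `ward`, since Ward's method is only well defined for Euclidean distances. The `members` dictionary is a hypothetical helper showing how row indices in `Z` map back to labels: indices below `n` are original items, and index `n + i` is the cluster created in row `i`.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram

# Label order implied by the pairs in result_set (assumed to follow the
# condensed order (0,1), (0,2), (0,3), (0,4), (1,2), ..., (3,4)).
labels = [
    "this has something to do with today",
    "this has something to do with tomorrow",
    "it is a rainy day",
    "it is raining today",
    "the weather is good",
]

# Third column of result_set, as floats (rounded to 8 decimals).
sims = np.array([
    0.95044514, 0.27315657, 0.21404568, 0.12284646,
    0.28564021, 0.19174771, 0.12920110,
    0.54390125, 0.20843820,
    0.19278768,
])

# Assumption: similarities are in [0, 1], so 1 - sim acts as a distance.
dists = 1.0 - sims

# 'average' is used here instead of 'ward' (see the note above).
Z = linkage(dists, method="average")

# Map every cluster id appearing in Z back to the labels it contains.
n = len(labels)
members = {i: [labels[i]] for i in range(n)}
for step, (a, b, _dist, _size) in enumerate(Z):
    members[n + step] = members[int(a)] + members[int(b)]

# no_plot=True returns the leaf order without needing matplotlib;
# drop it to draw the figure.
info = dendrogram(Z, labels=labels, orientation="right", no_plot=True)
leaf_order = info["ivl"]
```

With this setup the first merge joins the "today" and "tomorrow" sentences, and `members[n]` holds exactly those two labels. If the pairs were generated in a different order, `labels` has to be reordered to match.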