The Damerau-Levenshtein distance between the two strings "abc" and "acb" is 1, because it involves a single transposition of "b" and "c".
> stringdist("abc", "acb", method = "dl")
[1] 1
Now suppose I have the following two character vectors:
A = c("apple", "banana", "citrus")
B = c("apple", "citrus", "banana")
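How can I compute the Damerau-Levenshtein distance between A and B so that the result is the same as the distance between "abc" and "acb", since there is one transposition between "citrus" and "banana"? In other words, how can I compute the Damerau-Levenshtein distance between A and B so that each item counts as a single character in a string?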
Answer 0 (score: 1)
addVariant: function () {
const parts = this.form.attributes.map((attribute) =>
attribute.values.map((value) => {
return { value, key: attribute.title.text };
})
);
const combinations = parts.reduce((a, b) =>
a.reduce((r, v) => r.concat(b.map((w) => [].concat(v, w))), [])
);
let variants = [];
for (const combination of combinations) {
variants.push({
title: `${combination[0].value}/${combination[1].value}`,
slug: "...the value that you want",
options: combination,
});
}
this.variants = variants;
}
Answer 1 (score: 0)
How about:
vecdist <- function(x, y) {
  matches <- match(x, y, nomatch = 0)
  nomatch <- matches == 0
  # No match = we need 1 permutation
  # Other matches: compare index; for each "not inverted" index (not 3 vs -3) we need 1 permutation
  perm_match <- (matches - seq_along(matches))[!nomatch]
  perm_n <- sum(perm_match != 0) - sum(duplicated(abs(perm_match)))
  sum(nomatch) + perm_n + sum(!y %in% x)
}
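The basic idea here is to check two things:
First, whether any items in x have no match in y, and vice versa. Each unmatched item counts as 1 permutation.
Second, duplicated(abs(...)) checks whether any items have to be switched "with each other". For example, abcd, badc is 2 permutations, while abcd, bdca is 3. This is very similar to how stringdist works for single strings.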
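To make the index comparison concrete, the intermediate values inside vecdist can be inspected for the vectors from the question. This is only a sketch: the names offsets, moved and swapped are introduced here for illustration and do not appear in the function itself.

```r
x <- c("apple", "banana", "citrus")
y <- c("apple", "citrus", "banana")

# Position of each item of x within y (0 would mean "not found").
matches <- match(x, y, nomatch = 0)

# How far each item moved relative to its own position in x.
offsets <- matches - seq_along(matches)

# Items that are out of place, and pairs that share the same absolute offset.
moved   <- sum(offsets != 0)
swapped <- sum(duplicated(abs(offsets)))

# "banana" and "citrus" moved by +1 and -1, so the pair is counted once.
moved - swapped
```

Here matches is 1 3 2 and offsets is 0 1 -1, so two items are out of place but they form one pair with the same absolute offset, which yields 1.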
A = c("apple", "banana", "citrus")
B = c("apple", "citrus", "banana")
vecdist(A, B)
[1] 1
A <- c(A, 'pear')
vecdist(A, B)
[1] 2
vecdist(B, A)
[1] 2
A <- c('apple', 'banana', 'citrus', 'pear')
B <- c('pear', 'citrus', 'banana', 'apple')
vecdist(A, B)
[1] 2
vecdist(B, A)
[1] 2
A <- c('apple', 'banana', 'citrus', 'pear')
B <- c('pear', 'citrus', 'apple', 'banana')
vecdist(A, B)
[1] 3
vecdist(B, A)
[1] 3
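Another option, offered as a sketch rather than a tested replacement for vecdist: the stringdist package also provides seq_dist(), which computes the same distances on integer sequences instead of strings. If every distinct item is mapped to an integer code, each item behaves like one character. The names items, a_codes and b_codes are made up for this example.

```r
library(stringdist)

A <- c("apple", "banana", "citrus")
B <- c("apple", "citrus", "banana")

# Give every distinct item one integer code shared by both vectors,
# so that each item plays the role of a single "character".
items   <- unique(c(A, B))
a_codes <- match(A, items)
b_codes <- match(B, items)

# Damerau-Levenshtein distance between the two integer sequences.
seq_dist(a_codes, b_codes, method = "dl")
```

Because this reuses the package's own distance implementation, results for longer vectors follow the Damerau-Levenshtein definition directly and may differ from what vecdist returns.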