I need to compare two strings (or arrays) and return their similarity as a percentage, regardless of order

Asked: 2019-04-06 06:34:10

Tags: javascript arrays string

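I'm currently building a module for learning the vocabulary of ancient languages, and for that I need a tool that checks whether the user's answer matches the answer in the database.

The way I planned to implement it (if you have a more efficient solution, please let me know) is to count the characters (the inputs are all lowercase strings or arrays without punctuation) and check how similar they are as a percentage.

Is there a way to do this?

I tried to do something with .match(), but unfortunately it didn't work well.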

// these are the variables

let p = 'The lazy dog jumps over the quick brown fox. It barked.';
p = p.toLowerCase();
p = p.replace(/\s/g, '');
p = p.replace('.', '');
p = p.replace('.', '');

let a = 'The quick brown fox jumps over the lazy dog. It barked.';
a = a.toLowerCase();
a = a.replace(/\s/g, '');
a = a.replace('.', '');
a = a.replace('.', '');

let c = 'The quick black ostrich jumps over the lazy dog. It barked.';
c = c.toLowerCase();
c = c.replace(/\s/g, '');
c = c.replace('.', '');
c = c.replace('.', '');

// this is what should happen: 

compare(p,a); // should return 100%
compare(p,c); // should return 72% (if my math is correct)

1 answer:

Answer 0 (score: 1)

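You could count the characters of both strings in a single table: increment the count for each character of the first string, decrement it for each character of the second, and then sum the absolute values of the remaining counts. Characters that occur equally often in both strings cancel out, so the sum is the number of characters left unmatched.

Then return the similarity, computed from that sum and the string lengths.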

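function compare(a, b) {
    var count = {}, delta;
    
    // normalize both inputs: lowercase, no whitespace, periods, or commas
    a = clean(a);
    b = clean(b);
    
    // count up for a and down for b, so matching characters cancel out
    getCount(a, count, 1);
    getCount(b, count, -1);

    // delta is the number of characters left unmatched in either string
    delta = Object.values(count).reduce((s, v) => s + Math.abs(v), 0);
    
    // note: this ratio depends on the argument order
    return (b.length - delta) / a.length;
}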

function getCount(string, count = {}, inc = 1) {
    Array.from(string).forEach(c => count[c] = (count[c] || 0) + inc);
    return count;
}

const
    clean = s => s.toLowerCase().replace(/[\s.,]+/g, '');

var p = 'The lazy dog jumps over the quick brown fox. It barked.',
    a = 'The quick brown fox jumps over the lazy dog. It barked.',
    c = 'The quick black ostrich jumps over the lazy dog. It barked.';

console.log(compare(p, a));
console.log(compare(p, c));
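
Note that the ratio above depends on the argument order when the cleaned strings differ in length. If a symmetric score is preferred, one possible variant is sketched below. It is not part of the answer above: the names charCounts and symmetricSimilarity are hypothetical, and dividing the number of shared characters by the length of the longer string is an assumption about what "similarity" should mean here.

```javascript
// Hypothetical helper: count each character after the same cleaning
// as in the answer (lowercase; whitespace, periods, commas removed).
function charCounts(text) {
    const counts = new Map();
    for (const ch of text.toLowerCase().replace(/[\s.,]+/g, '')) {
        counts.set(ch, (counts.get(ch) || 0) + 1);
    }
    return counts;
}

// Hypothetical symmetric score: shared characters (minimum count per
// character) divided by the length of the longer cleaned string.
function symmetricSimilarity(a, b) {
    const countsA = charCounts(a);
    const countsB = charCounts(b);
    let totalA = 0, totalB = 0, shared = 0;

    for (const [ch, n] of countsA) {
        totalA += n;
        shared += Math.min(n, countsB.get(ch) || 0);
    }
    for (const n of countsB.values()) {
        totalB += n;
    }

    const longer = Math.max(totalA, totalB);
    return longer === 0 ? 1 : shared / longer; // two empty inputs count as identical
}

const lazyFirst = 'The lazy dog jumps over the quick brown fox. It barked.';
const quickFirst = 'The quick brown fox jumps over the lazy dog. It barked.';
const ostrich = 'The quick black ostrich jumps over the lazy dog. It barked.';

console.log(symmetricSimilarity(lazyFirst, quickFirst)); // 1 (same characters, different order)
console.log(symmetricSimilarity(lazyFirst, ostrich));    // 38 shared characters out of 47
console.log(symmetricSimilarity(ostrich, lazyFirst));    // same value as the line above
```

With this definition the order of the arguments no longer matters, at the cost of a different percentage than the asymmetric ratio gives for the same pair.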