Deduplicating an array of objects in JavaScript

Time: 2018-05-23 16:30:24

Tags: javascript arrays

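I'm working on a project that involves five similar SQL databases, and I need to detect and filter out duplicate records. I think I'm on the right track, but I'm not there yet. I'm trying to accomplish this with the following steps: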

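  1. Start a .forEach() on the main array of objects, passing item as the argument.
  2. Create a filtered array via let filtered = Array.filter(x => x.id !== item.id); so that an item is never checked against itself.
  3. Start a .forEach() on the filtered array, passing comparison as the argument.
  4. Initialize variables for the similarity of the name, phone, and email fields (e.g. nameSimilarity, phoneSimilarity, emailSimilarity).
  5. If item.email and comparison.email are not empty, compare the strings and store the similarity percentage in emailSimilarity; otherwise set emailSimilarity = 0.
  6. If item.phone and comparison.phone are not empty, compare the strings and store the similarity percentage in phoneSimilarity; otherwise set phoneSimilarity = 0.
  7. Combine item.firstName and item.lastName into a variable named itemFullName, and combine comparison.firstName and comparison.lastName into a variable named comparisonFullName.
  8. If itemFullName and comparisonFullName are not empty, compare the strings and store the similarity percentage in nameSimilarity; otherwise set nameSimilarity = 0.
  9. If any of the percentages in emailSimilarity, nameSimilarity, or phoneSimilarity is above the threshold (0.89 in the code below), add item, together with the similarity variables and comparison.id, to the duplicates array, and splice it out of the original array.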

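Here is the code I wrote to follow these steps, but it seems I'm getting duplicate entries in the duplicates array. I'm not sure why it isn't working as expected, but I have a hunch that I can't expect the original array to be mutated during the forEach() operation.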

    fullArray.forEach(item => {
        let filtered = fullArray.filter(x => x.externalId !== item.externalId);
        filtered.forEach(comparison => {
            let emailSimilarity, phoneSimilarity, nameSimilarity;
            if ((item.email !== '') && (comparison.email !== '')) {
                emailSimilarity = strcmp.jaro(item.email, comparison.email);
            } else {
                emailSimilarity = 0;
            }
            if ((item.phone !== '') && (comparison.phone !== '')) {
                phoneSimilarity = strcmp.jaro(item.phone, comparison.phone);
            } else {
                phoneSimilarity = 0;
            }
            let itemFullName = `${item.firstName} ${item.LastName}`.trim() || '';
            let comparisonFullName = `${comparison.firstName} ${comparison.LastName}`.trim();
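            // The posted condition used `!> 0`, which is not valid JavaScript; it is read here as "not greater than 0".
            if (((itemFullName !== '') && (comparisonFullName !== '')) || (!(itemFullName.indexOf('Group') > 0) && !(comparisonFullName.indexOf('Group') > 0))) {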
                nameSimilarity = strcmp.jaro(itemFullName, comparisonFullName);
            } else {
                nameSimilarity = 0;
            }
            if ((emailSimilarity || phoneSimilarity || nameSimilarity) > 0.89) {
                let dupesOutput = Object.assign({}, item, { similarName: nameSimilarity, similarEmail: emailSimilarity, similarPhone: phoneSimilarity, similarTo: comparison.externalId });
                dupes.push(dupesOutput);
                fullArray = fullArray.filter(x => x.externalId !== item.externalId);
            }
        });
    });

Where is the problem?

1 answer:

Answer 0: (score: 2)

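Assuming the similarity check works, the problem is that you assign a new array to fullArray while you are still inside the forEach loop over the old array. forEach keeps iterating over the array it was called on, so the reassignment does not change which elements the running loops visit.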
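The effect can be reproduced in isolation. The following is a minimal sketch with made-up data; the items array and its id field are hypothetical and not taken from the question:

```javascript
// Hypothetical sample data, not from the question.
let items = [{ id: 1 }, { id: 2 }, { id: 3 }];
const visited = [];

items.forEach(item => {
    visited.push(item.id);
    // Rebinds the variable to a new array; the array forEach is walking is untouched.
    items = items.filter(x => x.id !== 2);
});

console.log(visited);      // [ 1, 2, 3 ]
console.log(items.length); // 2
```

The loop still visits all three elements because forEach holds on to the array it was called on; only the variable items now points at the shorter array.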

I suggest you use Array.filter:

    // Assumes fullArray, dupes (an array) and strcmp are already defined, as in the question.
    var filteredArray = fullArray.filter(item => {
        // Keep item only if no other record is similar to it.
        return !fullArray.some(comparison => {
            // Never compare a record with itself.
            if (comparison.externalId === item.externalId) {
                return false;
            }

            let emailSimilarity, phoneSimilarity, nameSimilarity;
            if ((item.email !== '') && (comparison.email !== '')) {
                emailSimilarity = strcmp.jaro(item.email, comparison.email);
            } else {
                emailSimilarity = 0;
            }
            if ((item.phone !== '') && (comparison.phone !== '')) {
                phoneSimilarity = strcmp.jaro(item.phone, comparison.phone);
            } else {
                phoneSimilarity = 0;
            }
            let itemFullName = `${item.firstName} ${item.LastName}`.trim();
            let comparisonFullName = `${comparison.firstName} ${comparison.LastName}`.trim();
            // The posted condition used `!> 0`, which is not valid JavaScript; it is read here as "not greater than 0".
            if (((itemFullName !== '') && (comparisonFullName !== '')) || (!(itemFullName.indexOf('Group') > 0) && !(comparisonFullName.indexOf('Group') > 0))) {
                nameSimilarity = strcmp.jaro(itemFullName, comparisonFullName);
            } else {
                nameSimilarity = 0;
            }
            // Math.max checks every score; (a || b || c) > 0.89 would only test the first non-zero one.
            if (Math.max(emailSimilarity, phoneSimilarity, nameSimilarity) > 0.89) {
                let dupesOutput = Object.assign({}, item, { similarName: nameSimilarity, similarEmail: emailSimilarity, similarPhone: phoneSimilarity, similarTo: comparison.externalId });
                dupes.push(dupesOutput);
                // Returning true stops some() at the first match, so item is pushed at most once.
                return true;
            }
            return false;
        });
    });
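
Note that filtering with some() over the full array removes both members of every similar pair, since each one finds the other. If the goal is to keep one record per group, one possible variation is sketched below. It is only an illustration under several assumptions: bestScore and dedupe are hypothetical helper names, the similarity argument stands in for strcmp.jaro and is assumed to return a number between 0 and 1, and the name fields are assumed to be firstName and lastName as in the question's step list (the posted code reads LastName). Math.max is used so that every score is checked, because (a || b || c) > 0.89 compares only the first non-zero score.

```javascript
// Hypothetical helper: highest of the three field scores for a pair of records.
// `similarity(a, b)` is assumed to return a number between 0 and 1.
function bestScore(a, b, similarity) {
    // Empty or missing fields score 0, as in steps 5, 6 and 8 of the question.
    const score = (x, y) => (x && y ? similarity(x, y) : 0);
    const fullName = r => `${r.firstName || ''} ${r.lastName || ''}`.trim();
    return Math.max(
        score(a.email, b.email),
        score(a.phone, b.phone),
        score(fullName(a), fullName(b))
    );
}

// Keeps the first record of each group of similar records and reports the rest.
// Each record is compared only with records that have already been kept.
function dedupe(records, similarity, threshold = 0.89) {
    const kept = [];
    const dupes = [];
    for (const item of records) {
        // find() stops at the first match, so each record is reported at most once.
        const match = kept.find(k => bestScore(item, k, similarity) > threshold);
        if (match) {
            dupes.push(Object.assign({}, item, { similarTo: match.externalId }));
        } else {
            kept.push(item);
        }
    }
    return { kept, dupes };
}

// Usage with a stand-in similarity function (exact match only) and made-up data.
const exact = (x, y) => (x === y ? 1 : 0);
const result = dedupe([
    { externalId: 'a1', firstName: 'Ann', lastName: 'Lee', email: 'ann@example.com', phone: '' },
    { externalId: 'b7', firstName: 'Ann', lastName: 'Lee', email: '', phone: '555-0100' },
    { externalId: 'c3', firstName: 'Bob', lastName: 'Ray', email: 'bob@example.com', phone: '555-0199' },
], exact);

console.log(result.kept.map(r => r.externalId));  // [ 'a1', 'c3' ]
console.log(result.dupes.map(r => r.externalId)); // [ 'b7' ]
```

Because nothing is reassigned or spliced while a loop is running, the input array is left untouched and the result does not depend on mutation during iteration.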