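TL;DR: I need to be able to compare two strings of different lengths and pick out matches longer than 3 characters. Any match longer than 3 characters is most likely a complete word. I'm using AngularJS, jQuery and TypeScript, if that helps anyone.
I need to be able to compare two strings and note the matches that are longer than 3 characters. This will be done on names, addresses and email addresses.
Basically, the way this part of the web page works is that there is a user who is suspected of being a duplicate account of another user. The suspected duplicate is flagged, and the web app user looks at the flagged person alongside a list of people who may be the original account.
The section for the flagged user, and the sections for the potential matches, have their names, addresses and emails highlighted on the words that match. This is to guide the web app user, so they can see at a glance why a particular person was pulled up.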
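To make the highlighting step concrete, here is a minimal sketch of it as a pure function that needs no DOM. The names `escapeRegExp` and `highlightMatches` are hypothetical and only for illustration; the `dedupe-match` class is the one used in my code further down. It assumes the text contains no HTML special characters, since nothing is HTML-escaped here.

```typescript
// Hypothetical helper: escape characters that have a special meaning in a RegExp,
// so that terms such as "gmail.com" are matched literally.
function escapeRegExp(term: string): string {
    return term.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: wrap every occurrence of each term in a highlight span.
// Assumption: `text` is plain text without HTML special characters.
function highlightMatches(text: string, terms: string[]): string {
    const usable = terms.filter(t => t.length > 0);
    if (usable.length === 0) {
        return text;
    }
    // Try longer terms first so a short term cannot pre-empt a longer one.
    const pattern = usable
        .slice()
        .sort((a, b) => b.length - a.length)
        .map(escapeRegExp)
        .join('|');
    // A single pass over the text, so inserted markup is never re-scanned.
    return text.replace(
        new RegExp(pattern, 'g'),
        match => `<span class="dedupe-match">${match}</span>`
    );
}

const highlighted = highlightMatches('JohnathanDoe@yahoo.com', ['John', 'Doe']);
```

Doing all terms in one pass avoids a later term accidentally matching inside markup that an earlier replacement inserted.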
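For example, if the flagged user has an email of
JohnDoe1234@gmail.com
and potential match number one, of possibly three, has an email of
JohnathanDoe@yahoo.com
then the words John and Doe should be highlighted, so the person deciding whether the flagged user is a duplicate account can see at a glance that the flagged user just created a new email containing a shortened version of their first name. The emails closely match, the first and last names match, the address is the same: they are duplicates, so deny their account.
Given this task, here is what I have done so far.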
// Populate flaggedStringArr with the characters of the flagged assessor's email
for (var i: number = 0; i < this.flaggedAssessor.email.length; i++) {
    flaggedStringArr.push(this.flaggedAssessor.email.charAt(i));
}
// Populate matchedStringArr with the characters of the candidate assessor's email
for (var i: number = 0; i < this.assessor.email.length; i++) {
    matchedStringArr.push(this.assessor.email.charAt(i));
}
// Determine which array is shorter for the next loop.
// Compare the lengths: applying > to the arrays themselves compares their string forms.
flaggedShortest = flaggedStringArr.length <= matchedStringArr.length;
//compare the two string arrays. As matches are found store the letter that was matched in a new array.
//As matches are found increment a counter. If 3 or more consecutive matches are found, then a possible word has been found and matched.
//Loop duration is based off the shorter of the two string arrays
for (var i: number = 0; i < (flaggedShortest ? flaggedStringArr.length : matchedStringArr.length); i++) {
    if (flaggedStringArr[i] === matchedStringArr[i]) {
        storedChars.push(flaggedStringArr[i]);
        matchCounter++;
    } else {
        matchLog.push(matchCounter);
        matchCounter = 0;
    }
}
//matchLog has one entry that was not pushed because loop ends
//push that entry
matchLog.push(matchCounter);
//Begin searching the match log for matches
for (var i: number = 0; i < matchLog.length; i++) {
    //search for matchLog values that are greater than 3
    //every logged run, matched or not, advances stringTraverse through storedChars
    //when a match is found, use stringTraverse to start assembling a string at the appropriate location
    if (matchLog[i] > 3) {
        //a match has been found, assemble a regex search term
        //use stringTraverse to know where to start in the storedChars array
        //use the value in the matchLog at position i to tell you how far along the string to go
        for (var ni: number = 0; ni < matchLog[i]; ni++) {
            regExSearch += storedChars[stringTraverse + ni];
        }
        //The loop has constructed the string
        //add the regex search term to the regexArr for later
        regexArr.push(regExSearch);
        //Empty regExSearch for the next match search
        regExSearch = '';
    }
    //advance past this run in case there are further matches beyond this one
    stringTraverse += matchLog[i];
}
$('#emailToSwap').attr('id', `emailNumber${this.ownIndex}`);
//Now out of the loop, cycle through and apply styles to the regex search terms
for (var i: number = 0; i < regexArr.length; i++) {
    //escape regex metacharacters (such as the '.' in an email) so the term is matched literally
    var escapedTerm: string = regexArr[i].replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
    var regexObj: RegExp = new RegExp(escapedTerm, 'g');
    $(`#emailNumber${this.ownIndex}`).html(
        function (index, h) {
            return h.replace(regexObj, `<span class="dedupe-match">${regexArr[i]}</span>`);
        }
    );
}
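Then I realized the problem: this only works when the two strings line up character for character, as with strings of exactly the same length, so:
JohnDoe@gmail.com
and
JohnRoe@gmail.com
would match John and oe@gmail.com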
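The aligned case can be reproduced with a condensed, standalone version of the same index-by-index comparison. This is only a sketch: `positionalRuns` and its `minLength` parameter are hypothetical names, and the default of 4 reflects the "greater than 3" check in my code.

```typescript
// Hypothetical condensed version of the index-by-index comparison:
// collect runs of consecutive positions where both strings hold the same character.
function positionalRuns(a: string, b: string, minLength: number = 4): string[] {
    const runs: string[] = [];
    const limit = Math.min(a.length, b.length); // only the overlapping positions
    let current = '';
    for (let i = 0; i < limit; i++) {
        if (a.charAt(i) === b.charAt(i)) {
            current += a.charAt(i);
        } else {
            // A mismatch closes the current run; keep it only if it is long enough.
            if (current.length >= minLength) {
                runs.push(current);
            }
            current = '';
        }
    }
    // The last run is still open when the loop ends.
    if (current.length >= minLength) {
        runs.push(current);
    }
    return runs;
}

const alignedRuns = positionalRuns('JohnDoe@gmail.com', 'JohnRoe@gmail.com');
// alignedRuns is ['John', 'oe@gmail.com']
```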
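But if I offset one of them by a single character, i.e.
JohnDoe@gmail.com
and
SJohnRoe@gmail.com
my code now finds absolutely no matches, because it compares the two strings one character at a time at the same index. So now I don't know how to proceed.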
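One direction that might handle the offset case is to stop comparing index against index and instead look for common substrings at any pair of positions, using the standard longest-common-substring dynamic programming table. The sketch below is an assumption about what could work, not something already in my code: `commonSubstrings` and `minLength` are hypothetical names, and the threshold is a parameter so it can be set to either "more than 3" (4) or "3 or more" (3).

```typescript
// Hypothetical helper: find maximal runs of characters shared by both strings,
// regardless of where in each string the run starts.
function commonSubstrings(a: string, b: string, minLength: number = 4): string[] {
    const found: string[] = [];
    // prev[j] = length of the common run ending at a[i - 2] and b[j - 1]
    let prev: number[] = [];
    for (let j = 0; j <= b.length; j++) {
        prev.push(0);
    }
    for (let i = 1; i <= a.length; i++) {
        const curr: number[] = [0];
        for (let j = 1; j <= b.length; j++) {
            let run = 0;
            if (a.charAt(i - 1) === b.charAt(j - 1)) {
                // Extend the run that ended one character earlier in both strings.
                run = prev[j - 1] + 1;
                // The run is maximal if either string ends or the next characters differ.
                const runEnds = i === a.length || j === b.length || a.charAt(i) !== b.charAt(j);
                if (runEnds && run >= minLength) {
                    const match = a.substring(i - run, i);
                    if (found.indexOf(match) === -1) {
                        found.push(match);
                    }
                }
            }
            curr.push(run);
        }
        prev = curr;
    }
    return found;
}

const shifted = commonSubstrings('JohnDoe@gmail.com', 'SJohnRoe@gmail.com');
// shifted is ['John', 'oe@gmail.com']
```

This does work proportional to the product of the two string lengths, which should be fine for names, addresses and emails. Note that with a threshold of 4, Doe (3 characters) in the earlier example would not be reported; passing 3 as `minLength` picks it up.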
I'm using AngularJS, jQuery and TypeScript, if that is useful information to anyone.