I have a string that looks like this:
someString = "#3Hello there! How many #4candies did you sell today? Do have any #4candies left?"
lookupDict = {"Hello there": "#3", "candies": "#4"}
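Now I want to tag every term in someString that is not in the dictionary lookupDict with the string #0 (that is, prefix it with #0). I can't split on the space " ", because that would break a term such as Hello there into two separate words, Hello and there, and it would never match my condition.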
I know how to apply a basic regex that adds #0 in front of every word, for example something like:
let regex = /(\b\w+\b)/g;
someString = someString.replace(regex, '#0$1');
But that blindly adds #0 to every term, without looking anything up in the dictionary lookupDict.
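To make the problem concrete, here is a small sketch of what the blind replacement does to a string that already carries tags. The names sample and blind are illustrative only and are not part of the original question.

```javascript
// Illustrative only: `sample` and `blind` are made-up names, not from the question.
const sample = "#3Hello there! #4candies";

// The same regex as above: every run of word characters gets "#0" in front of it.
const blind = sample.replace(/(\b\w+\b)/g, "#0$1");

console.log(blind);
// => ##03Hello #0there! ##04candies
```

The existing tags are split apart (#3Hello turns into ##03Hello), and the second half of Hello there is tagged as if it were an unknown word.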
Is there a way to combine the regex with a lookup in the dictionary and assign #0 accordingly? Basically, the end result should look like this:
someString = "#3Hello there! #0How #0many #4candies #0did #0you #0sell #0today? #0Do #0have #0any #4candies #0left?"
Note: spaces here can be treated as word boundaries.
Answer 0 (score: 1)
With this approach you don't have to worry about the length of the lookupDict keys or anything else:
let someString =
"#3Hello there! How many #4candies did you sell today? #3Hello there! Do have any #4candies left?#3Hello there! #7John Doe! some other text with having #7John Doe person again";
const lookupDict = { "Hello there": "#3", candies: "#4", "John Doe": "#7" };
// Pass 1: collapse each key to its tag, e.g. "#3Hello there" => "#3#3"
Object.keys(lookupDict).forEach((key) => {
  const regex = new RegExp(key, "g");
  someString = someString.replace(regex, lookupDict[key]);
});
// Pass 2: prefix whatever follows a space with "#0"
someString = someString.replace(/ /g, " #0");
// Pass 3: roll each doubled tag back to tag + key, e.g. "#3#3" => "#3Hello there"
Object.keys(lookupDict).forEach((key) => {
  const regex = new RegExp(lookupDict[key] + lookupDict[key], "g");
  someString = someString.replace(regex, `${lookupDict[key]}${key}`);
});
// Pass 4: drop the "#0" that landed directly in front of a real tag
someString = someString.replace(/#0#/g, "#");
console.log(someString);
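The intermediate strings make it easier to see why the four replacements above work. The following is a minimal sketch, not the answer's own code: tagWithTrace is a hypothetical helper, and it uses split/join instead of new RegExp so that keys need no escaping.

```javascript
// Hypothetical helper: runs the four passes and records the string after each one.
function tagWithTrace(input, lookupDict) {
  const steps = [];
  let s = input;
  // Pass 1: collapse every key to its tag, so "#3Hello there" becomes "#3#3".
  for (const [key, tag] of Object.entries(lookupDict)) {
    s = s.split(key).join(tag);
  }
  steps.push(s);
  // Pass 2: prefix whatever follows a space with "#0".
  s = s.replace(/ /g, " #0");
  steps.push(s);
  // Pass 3: expand each doubled tag back to tag + key.
  for (const [key, tag] of Object.entries(lookupDict)) {
    s = s.split(tag + tag).join(tag + key);
  }
  steps.push(s);
  // Pass 4: drop the "#0" that landed in front of a real tag.
  s = s.replace(/#0#/g, "#");
  steps.push(s);
  return steps;
}

const steps = tagWithTrace("#3Hello there! How many #4candies", {
  "Hello there": "#3",
  candies: "#4",
});
console.log(steps.join("\n"));
// #3#3! How many #4#4
// #3#3! #0How #0many #0#4#4
// #3Hello there! #0How #0many #0#4candies
// #3Hello there! #0How #0many #4candies
```

One consequence of this scheme is that it relies on every occurrence of a key already carrying its tag: an untagged candies would be collapsed to #4 in the first pass and never expanded again.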
Answer 1 (score: 1)
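You can use the following logic:
- Build an array of substrings, each one a value concatenated with its key.
- Sort the array by length in descending order and join the escaped items into a pattern that either captures one of those substrings or matches any other run of non-whitespace characters.
- When replacing, return the match unchanged if the capturing group matched; otherwise add #0 to the matched value.
Here is the implementation: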
let someString = "#3Hello there! How many #4candies did you sell today? Do have any #4candies left? #0how #0much";
const lookupDict = {"Hello there": "#3", "candies": "#4", "how": "#0", "much": "#0"};
let patternDict = []; // Substrings to skip
for (var key in lookupDict) {
patternDict.push( `${lookupDict[key]}${key}` ); // Values + keys
}
patternDict.sort(function(a, b){ // Sorting by length, descending
return b.length - a.length;
});
var rx = new RegExp("(?:^|\\W)(" + patternDict.map(function(m) { // Building the final pattern
return m.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');}
).join("|") + ")(?!\\w)|\\S+", "gi");
// rx = /(?:^|\W)(#3Hello there|#4candies|#0much|#0how)(?!\w)|\S+/gi
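// Keep known terms (group 1) and tokens with no word characters, such as a lone "!"; prefix the rest
someString = someString.replace(rx, (x, y) => (y || !/\w/.test(x) ? x : `#0${x}`));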
console.log(someString);
// => #3Hello there! #0How #0many #4candies #0did #0you #0sell #0today? #0Do #0have #0any #4candies #0left? #0how #0much
The regex looks like
/(?:^|\W)(#3Hello there|#4candies|#0much|#0how)(?!\w)|\S+/gi
See the regex demo (the PHP option is selected there so that the groups are highlighted in green).
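To check which branch of the pattern produces each match, the matches can be listed with matchAll. This is only a sketch for inspection: the pattern is copied from the answer, while sample and report are illustrative names.

```javascript
// The pattern is copied from the answer; `sample` and `report` are illustrative names.
const rx = /(?:^|\W)(#3Hello there|#4candies|#0much|#0how)(?!\w)|\S+/gi;
const sample = "#3Hello there! How many #4candies";

const report = [...sample.matchAll(rx)].map((m) => ({
  match: m[0],
  // m[1] is undefined whenever the \S+ branch produced the match.
  branch: m[1] !== undefined ? "group 1" : "\\S+",
}));

for (const { match, branch } of report) {
  console.log(JSON.stringify(match), "<-", branch);
}
// "#3Hello there" <- group 1
// "!" <- \S+
// "How" <- \S+
// "many" <- \S+
// " #4candies" <- group 1
```

Note that punctuation left over after a tagged term, the ! here, is a \S+ match of its own, so the replacement callback receives it as a separate token.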
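Details
(?:^|\W) - a non-capturing group that matches either the start of the string (^) or (|) any non-word char (a char other than an ASCII letter, digit or _)
(#3Hello there|#4candies|#0much|#0how) - capturing group 1, which matches any of the lookupDict value + key concatenations
(?!\w) - a negative lookahead that fails the match if there is a word char immediately to the right of the current position
| - or
\S+ - one or more non-whitespace chars.
Answer 2 (score: -1)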
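You can pass a function as the second argument to .replace and check the matched token against the dictionary.
I changed the regex so that words directly preceded by # are left out of the results.
Hello there is a problem, though: how long can a term be? At most two words?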
someString = "#3Hello there! How many #4candies did you sell today? Do have any #4candies left?"
let regex = /(?<!#)(\b\w+\b)/g;
someString = someString.replace(regex, x => {
// check x in dict
return `#0${x}`
});
console.log(someString)
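One possible way to fill in the "check x in dict" step and to handle multi-word terms is to build the dictionary keys into the pattern itself, so the callback only has to ask whether a key matched. This is a sketch under assumptions, not the answer's own solution: escapeRegExp and tagUnknown are hypothetical helpers, and tags are assumed to have the form # followed by digits.

```javascript
const lookupDict = { "Hello there": "#3", candies: "#4" };

// Hypothetical helper: escape regex metacharacters in a dictionary key.
const escapeRegExp = (s) => s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");

// Longest keys first, so a multi-word key wins over a shorter one it contains.
const keys = Object.keys(lookupDict)
  .sort((a, b) => b.length - a.length)
  .map(escapeRegExp);

// Either an optional "#<digits>" tag followed by a known key, or any other word.
// Assumption: tags always look like "#" plus digits.
const regex = new RegExp(`(?:#\\d+)?(${keys.join("|")})\\b|\\b\\w+\\b`, "g");

function tagUnknown(input) {
  // `key` is set only when the first alternative matched a dictionary key.
  return input.replace(regex, (match, key) =>
    key !== undefined ? match : `#0${match}`
  );
}

const tagged = tagUnknown(
  "#3Hello there! How many #4candies did you sell today? Do have any #4candies left?"
);
console.log(tagged);
// => #3Hello there! #0How #0many #4candies #0did #0you #0sell #0today? #0Do #0have #0any #4candies #0left?
```

Because the keys are part of the alternation, the number of words in a term no longer matters; the lookbehind is not needed either, since a tag and its key are consumed together as one match.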