Question

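Below is the latest version of the regular expression I am using; it throws the error "Invalid regular expression."

Any help with the regex syntax would be much appreciated!

Here is my code: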

// This function gets all the text in browser
function getText() {
    return document.body.innerText;
}
var allText = getText(); // stores the page text in a variable

//regex set to rid text of all punctuation, symbols, numbers, and excess spaces
var matcher = new RegExp ("/(?<!\w)[a-zA-Z]+(?!\w)/", "g");

//cleanses text in browser of punctuation, symbols, numbers, and excess spaces
var newWords = allText.match(matcher);

//using a single space as the dividing tool, creates a list of all words
var Words=newWords.split(" ");

Answer 1

Instead of

//regex set to rid text of all punctuation, symbols, numbers, and excess spaces
var matcher = new RegExp ("/(?<!\w)[a-zA-Z]+(?!\w)/", "g");
//cleanses text in browser of punctuation, symbols, numbers, and excess spaces
var newWords = allText.match(matcher);
//using a single space as the dividing tool, creates a list of all words
var Words=newWords.split(" ");

just use

var Words = allText.match(/\b[a-zA-Z]+\b/g); // OR...
// var Words = allText.match(/\b[A-Z]+\b/ig);

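This gives you every "word" made up solely of ASCII letters: String#match with a /g regex returns all substrings that match the pattern (here, one or more ASCII letters between word boundaries). The result is already an array, so no further split is needed.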
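As a minimal sketch of that behaviour, the snippet below runs the same pattern against a fixed string. The `sampleText` value is a made-up stand-in for `document.body.innerText`, and the `|| []` fallback is an added assumption to cover the case where nothing matches and `match` returns `null`.

```javascript
// Hypothetical sample input standing in for document.body.innerText.
const sampleText = "Hello, world! 42 cats.";

// With the g flag, String#match returns an array of every match,
// or null when nothing matches, hence the "|| []" fallback.
const words = sampleText.match(/\b[a-zA-Z]+\b/g) || [];

console.log(words); // [ 'Hello', 'world', 'cats' ]
```

Because the result is already an array, calling `.split(" ")` on it, as the question's code does, would fail: arrays have no `split` method.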

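JS does not support lookbehind (that is, the (?<!) or (?<=) constructs) in engines that predate ES2018, which is why the pattern is rejected as invalid; what you need here is a word boundary, \b.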
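Apart from lookbehind, the way the pattern is passed to `new RegExp` in the question looks problematic in two more ways, sketched below: backslash escapes inside an ordinary string literal are consumed by the string before the regex engine sees them, and slashes written inside the string become literal `/` characters. The variable names are illustrative only, and the `try`/`catch` probe is just one possible way to check whether the current engine accepts lookbehind.

```javascript
// Inside a string literal, "\w" collapses to plain "w",
// so the regex engine never sees the escape.
console.log("\w" === "w"); // true

// Slashes inside the string are literal "/" characters, not delimiters.
const withSlashes = new RegExp("/[a-z]+/");
console.log(withSlashes.test("abc"));   // false
console.log(withSlashes.test("/abc/")); // true

// Doubling the backslashes and dropping the slashes gives the same
// pattern as the regex literal.
const fromString = new RegExp("\\b[a-zA-Z]+\\b", "g");
const fromLiteral = /\b[a-zA-Z]+\b/g;
console.log(fromString.source === fromLiteral.source); // true

// Probe for lookbehind support: engines without it throw a SyntaxError.
let lookbehindSupported = true;
try {
  new RegExp("(?<!\\w)[a-zA-Z]+(?!\\w)", "g");
} catch (e) {
  lookbehindSupported = false;
}
console.log(lookbehindSupported);
```

A regex literal avoids both pitfalls, which is one reason to prefer it whenever the pattern is fixed.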

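Note that you would need something like .replace(/\W+/g, ' ') to actually strip the text of punctuation, symbols, and excess spaces (\W does not match digits or underscores, so numbers would survive it), but it seems you can simply rely on String#match.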
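For comparison, here is a rough sketch of the replace-then-split route. It assumes the goal is to keep ASCII letters only, so it uses the character class `[^a-zA-Z]+` rather than `\W+`, since `\W` leaves digits and underscores untouched. The `sampleText` value is again a made-up stand-in for the page text.

```javascript
// Hypothetical sample input standing in for document.body.innerText.
const sampleText = "Hello, world! 42 cats.";

// Collapse every run of non-letters into one space, then trim the ends.
const cleaned = sampleText.replace(/[^a-zA-Z]+/g, " ").trim();

// Guard against an empty string, which would otherwise split into [""].
const words = cleaned === "" ? [] : cleaned.split(" ");

console.log(cleaned); // Hello world cats
console.log(words);   // [ 'Hello', 'world', 'cats' ]
```

This takes two steps and an extra guard, which is why matching the words directly tends to be the simpler option.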
