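With the regular expression id: ([a-z]|[A-Z]+)\\w* I can recognize every identifier that starts with a letter. Is there a way to use a single regular expression that also excludes certain specific identifiers (for example, the keywords of a programming language)?
Picture the following input line:
car zed var for while airplane
Here var, for, and while are keywords of my programming language. The regular expression should match only car, zed, and airplane.
Is this possible? Many thanks in advance!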
Answer 0 (score: 2)
Tested with grep:
kent$ echo "car zed var for while airplane"|grep -Po '(?!\bfor|\bwhile|\bvar)\b\w+'
car
zed
airplane
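One caveat with the pattern above: the lookahead has no trailing word boundary, so it also rejects any word that merely starts with a keyword, such as format or variable. The sketch below uses Python's re module instead of grep to compare it with a variant that puts \b after the keyword alternation. The names LOOSE and STRICT and the two extra words in the sample line are illustrative assumptions, not part of the original answer.

```python
import re

# Pattern from the grep answer: the lookahead only checks that a keyword
# *starts* here, so longer words with a keyword prefix are rejected too.
LOOSE = re.compile(r"(?!\bfor|\bwhile|\bvar)\b\w+")

# Variant with a word boundary after the alternation: only whole-word
# keywords are rejected.
STRICT = re.compile(r"\b(?!(?:for|while|var)\b)\w+")

# Sample line from the question, plus two hypothetical identifiers
# that begin with a keyword.
line = "car zed var for while airplane format variable"

loose_matches = LOOSE.findall(line)
strict_matches = STRICT.findall(line)

print(loose_matches)   # format and variable are missing
print(strict_matches)  # format and variable are kept
```

If identifiers in your language may begin with a keyword, the variant with the trailing \b is probably the one you want.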
Answer 1 (score: 1)
Use word anchors and alternation:
\b(var|for|while)\b
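This matches only the exact keywords you listed, as whole words.
Edit: I completely misread your question. To match every word except those keywords: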
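using System.Text.RegularExpressions;

// subjectString is assumed to hold the input text to scan.
// The lookahead rejects a word only when it is exactly for, var, or while.
Regex regexObj = new Regex(@"\b(?!(?:for|var|while)\b)\w+\b");
Match matchResults = regexObj.Match(subjectString);
while (matchResults.Success) {
    // matched text: matchResults.Value
    // match start: matchResults.Index
    // match length: matchResults.Length
    matchResults = matchResults.NextMatch();
}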
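To tie this back to the question's requirement that identifiers start with a letter, here is a minimal sketch of the same negative-lookahead idea, written in Python rather than C#. The names KEYWORDS, build_identifier_pattern, and find_identifiers are hypothetical. It assumes the three keywords from the question and swaps \w+ for [A-Za-z]\w* so that a match has to begin with a letter.

```python
import re

# Keywords taken from the question; extend the tuple for a real language.
KEYWORDS = ("for", "var", "while")


def build_identifier_pattern(keywords):
    """Compile a regex matching letter-initial words that are not keywords."""
    # re.escape guards against keywords containing regex metacharacters.
    alternation = "|".join(re.escape(k) for k in keywords)
    # \b            start at a word boundary
    # (?!(...)\b)   fail if a whole keyword starts here
    # [A-Za-z]\w*   a letter followed by any number of word characters
    return re.compile(r"\b(?!(?:%s)\b)[A-Za-z]\w*" % alternation)


IDENTIFIER = build_identifier_pattern(KEYWORDS)


def find_identifiers(text):
    """Return all non-keyword identifiers in text, in order of appearance."""
    return IDENTIFIER.findall(text)


print(find_identifiers("car zed var for while airplane"))
```

Building the alternation from a list keeps the keyword set in one place. Words that only begin with a keyword, such as forx or while_1, still count as identifiers because of the \b inside the lookahead.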
Explanation:
"
\b # Assert position at a word boundary
(?! # Assert that it is impossible to match the regex below starting at this position (negative lookahead)
(?: # Match the regular expression below
# Match either the regular expression below (attempting the next alternative only if this one fails)
for # Match the characters “for” literally
| # Or match regular expression number 2 below (attempting the next alternative only if this one fails)
var # Match the characters “var” literally
| # Or match regular expression number 3 below (the entire group fails if this one fails to match)
while # Match the characters “while” literally
)
\b # Assert position at a word boundary
)
\w # Match a single character that is a “word character” (letters, digits, etc.)
+ # Between one and unlimited times, as many times as possible, giving back as needed (greedy)
\b # Assert position at a word boundary
"