Question

I want to use re.findall to find words in a string that start with a single non-alphanumeric character, for example '$'.

Examples of matching words

$Python
$foo
$any_word123

Examples of non-matching words

$$Python
foo
foo$bar

Why `\b` does not work

If the first character were alphanumeric, I could do this.

re.findall(r'\bA\w+', s)

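But this does not work for a pattern like \b\$\w+, because \b only matches the empty string between a \w and a \W character (or between a \w character and the start or end of the string). Since '$' is itself a \W character, \b\$ can only match when the '$' is immediately preceded by a word character.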

# The line below matches only the last '$baz', which is the one that should not be matched
re.findall(r'\b\$\w+', '$foo $bar x$baz')

The above outputs ['$baz'], but the desired pattern should output ['$foo', '$bar'].
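To see where `\b` can actually match, one can list the boundary positions in a small sample string. The following is a minimal sketch; the sample string and the variable names are illustrative and not part of the original question.

```python
import re

sample = '$foo x$baz'

# Every position where \b matches: a transition between \w and \W,
# or between \w and the start/end of the string.
boundaries = [m.start() for m in re.finditer(r'\b', sample)]
print(boundaries)  # [1, 4, 5, 6, 7, 10]

# Position 0 (just before the first '$') is not a boundary, because '$'
# is a \W character at the start of the string. Position 6 (just before
# the second '$') is a boundary, because it follows the word character 'x'.
matches = re.findall(r'\b\$\w+', sample)
print(matches)  # ['$baz']
```

This is why `\b\$\w+` picks up exactly the occurrences that should be rejected.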

I tried replacing \b with a positive lookbehind on the pattern ^|\s, but that does not work because lookbehinds must have a fixed width.
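The failure described above can be reproduced with a short sketch. It assumes the attempted pattern was `(?<=^|\s)\$\w+`, which is a guess at what the question means; the variable name `error_message` is illustrative.

```python
import re

# '^' has width 0 and '\s' has width 1, so the alternation inside the
# lookbehind does not have a single fixed width and Python's re module
# rejects it when the pattern is compiled.
try:
    re.compile(r'(?<=^|\s)\$\w+')
    error_message = None
except re.error as exc:
    error_message = str(exc)

print(error_message)
```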

What is the proper way to handle this kind of pattern?

Answer 1

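One way is to use a negative lookbehind with the non-whitespace metacharacter \S. The assertion (?<!\S) succeeds when the previous character is not a non-whitespace character, that is, at the start of the string or right after whitespace, and it has a fixed width of one character.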

import re

s = '$Python $foo foo$bar baz'

re.findall(r'(?<!\S)\$\w+', s)  # output: ['$Python', '$foo']
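The same lookbehind can be combined with a character class to cover any leading non-alphanumeric character, not only '$'. This is a sketch under that assumption; the `[^\w\s]` generalisation and the sample text (built from the question's examples) are not part of this answer.

```python
import re

# [^\w\s] matches one character that is neither a word character nor
# whitespace. Note that '_' counts as a word character, so it is excluded.
pattern = re.compile(r'(?<!\S)[^\w\s]\w+')

text = '$Python $foo $any_word123 $$Python foo foo$bar'
result = pattern.findall(text)
print(result)  # ['$Python', '$foo', '$any_word123']
```

'$$Python' is rejected because the first '$' is not followed by a word character and the second '$' is preceded by a non-whitespace character.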

Answer 2

The following will match words that start with a single non-alphanumeric character.

re.findall(r'''
(?:     # start non-capturing group
  ^         # start of string
  |         # or
  \s        # whitespace character
)       # end non-capturing group
(       # start capturing group
  [^\w\s]   # character that is not a word or whitespace character
  \w+       # one or more word characters
)       # end capturing group
''', s, re.X)

Or simply:

re.findall(r'(?:^|\s)([^\w\s]\w+)', s, re.X)

Result:

'$a $b a$c $$d' -> ['$a', '$b']
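One detail worth noting about this pattern: the non-capturing group consumes the leading whitespace, so the full match and the captured group differ. re.findall returns only the captured group, which is why the result looks clean. A small sketch of the difference, with illustrative variable names:

```python
import re

grouped = re.compile(r'(?:^|\s)([^\w\s]\w+)')
s = '$a $b a$c $$d'

# findall returns the contents of the single capturing group.
words = grouped.findall(s)
print(words)  # ['$a', '$b']

# With finditer, group(0) is the whole match and includes the whitespace
# consumed by (?:^|\s); group(1) is the word itself.
full_matches = [m.group(0) for m in grouped.finditer(s)]
captured = [m.group(1) for m in grouped.finditer(s)]
print(full_matches)  # ['$a', ' $b']
print(captured)      # ['$a', '$b']
```

If match positions or re.sub are needed, use group(1) or start(1) rather than the full match.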
