Question

I want to find all the numbers that:

come after a '#'
or are at the beginning of the string

For example,

>>> import re
>>> s = '1___2___#3___@4___##5'

the result should be

['1', '3', '5']

What I have now is

>>> re.findall(r'#(\d)', s)          # ['3', '5']

and

>>> re.findall(r'^(\d)', s)          # ['1']

but I don't know how to combine them into a single regex. Thanks for your help.

Answer 1

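Keep it simple...

re.findall gives preference to capturing groups: if the pattern contains one, only the captured text is returned. So put the preceding ^ (start anchor) and # in a non-capturing group, and capture just the digits.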

>>> s = '1___2___#3___@4___##5'
>>> re.findall(r'(?:^|#)(\d+)', s)
['1', '3', '5']
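To see why the non-capturing group matters, here is a small sketch comparing three variants of the same pattern on the question's string. The variable names (whole, pairs, digits) are just illustrative.

```python
import re

s = '1___2___#3___@4___##5'

# No capturing group: findall returns the whole match, '#' included.
whole = re.findall(r'(?:^|#)\d+', s)

# Two capturing groups: findall returns one tuple per match.
pairs = re.findall(r'(^|#)(\d+)', s)

# One capturing group for the digits, non-capturing group for the prefix:
# findall returns only the digits.
digits = re.findall(r'(?:^|#)(\d+)', s)

print(whole)   # ['1', '#3', '#5']
print(pairs)   # [('', '1'), ('#', '3'), ('#', '5')]
print(digits)  # ['1', '3', '5']
```

Only the last form gives the flat list of digits the question asks for.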

Or

Even simpler..

>>> s = '1___2___#3___@4___##5'
>>> re.findall(r'(?<![^#])\d+', s)
['1', '3', '5']


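Here is how the regex above works...

(?<!.)\d+ matches all digits that are not preceded by any character (other than a newline, which . does not match). So it matches the digits at the start of the string, because that is the only position where this condition holds.

(?<![^#])\d+ goes one step further. It still matches the digits at the start, because [^#] has to consume a character and there is none before the start of the string. It also matches all digits that are not preceded by a character other than #, that is, digits that come right after a #.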
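The two steps can be checked directly. This is only a sketch: the extra inputs ('12_#34_@56' and '1\n2') are made up here to show multi-digit numbers and the newline caveat, they are not from the question.

```python
import re

s = '1___2___#3___@4___##5'

# (?<!.) : no character may precede the digits, so only the start matches.
only_start = re.findall(r'(?<!.)\d+', s)

# (?<![^#]) : no non-'#' character may precede the digits,
# so the start of the string and positions right after '#' match.
start_or_hash = re.findall(r'(?<![^#])\d+', s)

# Multi-digit numbers: the '6' in '@56' is preceded by '5' (not a '#'),
# so no partial match sneaks in.
multi = re.findall(r'(?<![^#])\d+', '12_#34_@56')

# Newline caveat: '.' does not match '\n' by default, but [^#] does.
lines = '1\n2'
dot_version = re.findall(r'(?<!.)\d+', lines)
class_version = re.findall(r'(?<![^#])\d+', lines)

print(only_start)     # ['1']
print(start_or_hash)  # ['1', '3', '5']
print(multi)          # ['12', '34']
print(dot_version)    # ['1', '2']
print(class_version)  # ['1']
```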

Answer 2

^\d+|(?<=#)\d+

You can try this. See the demo.

https://regex101.com/r/sH8aR8/51

Use

re.findall(r'^\d+|(?<=#)\d+', s)

Use zero-width assertions to capture only what you need.

NODE                     EXPLANATION
--------------------------------------------------------------------------------
  ^                        the beginning of the string
--------------------------------------------------------------------------------
  \d+                      digits (0-9) (1 or more times (matching
                           the most amount possible))
--------------------------------------------------------------------------------
 |                        OR
--------------------------------------------------------------------------------
  (?<=                     look behind to see if there is:
--------------------------------------------------------------------------------
    #                        '#'
--------------------------------------------------------------------------------
  )                        end of look-behind
--------------------------------------------------------------------------------
  \d+                      digits (0-9) (1 or more times (matching
                           the most amount possible))
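The breakdown above can also be written as a commented pattern with re.VERBOSE. A sketch, assuming you want the pattern compiled once and reused; note that in verbose mode an unescaped # starts a comment, so the literal # has to be written as \#. The second input string is made up for illustration.

```python
import re

pattern = re.compile(r"""
    ^\d+          # digits at the beginning of the string
    |             # OR
    (?<=\#)\d+    # digits preceded by a literal hash (escaped in verbose mode)
""", re.VERBOSE)

s = '1___2___#3___@4___##5'
result = pattern.findall(s)
print(result)                          # ['1', '3', '5']
print(pattern.findall('12_#34_@56'))   # ['12', '34']
```

Since the pattern has no capturing group, findall returns the whole match, and the lookbehind keeps the # out of it.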

Is there a good way to match either a pattern or the beginning of the string?
