Question

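The example string is "CPLR_DUK10_772989_2". I want to pick out "772989" specifically. I think re.findall is a good way to go about it, but I don't have a good grasp of regex, so I'm finding myself stumped.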

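Below is a bit of code I thought would work, until I looked at the full list of strings and saw that it definitely does not. I guess I'm looking for something more robust!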

for ad in Ads:
    num = ''.join(re.findall(numbers,ad)[1:7])
    ID.append(num)
ID = pd.Series(ID)

Other example strings: "Teb1_110765", "PAN1_111572_5".

Answer 1

The regex you're looking for is

p = re.findall(r'_(\d{6})', ad)

This matches six digits preceded by an underscore and returns a list of all matches (in case there is more than one).

Demo:

>>> import re
>>> stringy = 'CPLR_DUK10_772989_2'
>>> re.findall(r'_(\d{6})', stringy)
['772989']
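One caveat, offered as a sketch rather than as part of the answer above: `_(\d{6})` also matches the first six digits of a longer run such as `_1234567`. Assuming the IDs are always exactly six digits long, a negative lookahead `(?!\d)` rules that case out. The names `SIX_DIGITS` and `find_six_digit_ids` are made up for this illustration.

```python
import re

# Six digits right after an underscore, not followed by a seventh digit.
# Assumption: a valid ID is exactly six digits long.
SIX_DIGITS = re.compile(r'_(\d{6})(?!\d)')

def find_six_digit_ids(text):
    """Return every exactly-six-digit block that follows an underscore."""
    return SIX_DIGITS.findall(text)

# The three strings from the question, plus one seven-digit counterexample.
for sample in ['CPLR_DUK10_772989_2', 'Teb1_110765', 'PAN1_111572_5', 'X_1234567']:
    print(sample, find_six_digit_ids(sample))
```

The seven-digit sample yields an empty list, whereas the plain `_(\d{6})` pattern would return `['123456']` for it.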

Answer 2

This should append every 6-digit number that follows an underscore.

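import re
import pandas as pd
ID = []
for ad in Ads:  # Ads is the list of strings from the question
    blocks = re.split('_', ad)
    # skip the first block, which comes before any underscore
    for block in blocks[1:]:
        if len(block) == 6 and block.isdigit():
            ID.append(block)
ID = pd.Series(ID)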
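Note that the loop above appends every matching block, so a string with zero or two 6-digit blocks would leave the result out of step with the input list. If exactly one value per string is wanted, a variant along these lines may help. This is a sketch under that assumption, and `first_six_digit_block` is a hypothetical helper name.

```python
def first_six_digit_block(ad):
    """Return the first 6-digit block after an underscore, or None if absent."""
    # Skip the first block: it comes before any underscore.
    for block in ad.split('_')[1:]:
        if len(block) == 6 and block.isdigit():
            return block
    return None

# Sample data: the three strings from the question plus one without an ID.
ads = ['CPLR_DUK10_772989_2', 'Teb1_110765', 'PAN1_111572_5', 'no_id_here']
ids = [first_six_digit_block(ad) for ad in ads]
print(ids)
```

Because `ids` has one entry per input string, it can be passed to `pd.Series` as in the question and will stay aligned with the original list.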

Answer 3

You can use a list comprehension:

>>> s="CPLR_DUK10_772989_2"
>>> [x for x in s.split('_') if len(x)==6 and x.isdigit()]
['772989']

If your string is very long and you only need to find a single number, you can use itertools like this:

>>> from itertools import dropwhile
>>> next(dropwhile(lambda x: not(len(x)==6 and x.isdigit()), s.split('_')))
'772989'
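One thing to watch for with `next`: if no block qualifies, the iterator is exhausted and `next` raises `StopIteration`. A possible way around that, sketched here with a hypothetical helper named `first_match`, is to pass a default as the second argument to `next`.

```python
def first_match(s):
    """Return the first 6-digit block in s, or None when there is none."""
    # Generator expression: blocks are tested one at a time, on demand.
    candidates = (x for x in s.split('_') if len(x) == 6 and x.isdigit())
    # The second argument is returned instead of raising StopIteration.
    return next(candidates, None)

print(first_match("CPLR_DUK10_772989_2"))
print(first_match("no_digits_here"))
```

Keep in mind that `s.split('_')` still builds the full list of blocks up front, so only the filtering step stops early.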

How do I find a 6-digit number in a string in Python?

3 answers: