Question

I'm trying to learn how to use regular expressions in Python. I want to retrieve an ID number (in parentheses) from a string that looks like this:

"This is a string of variable length (561401)"

The ID number (561401 in this example) can be of variable length, and so can the text.

"This is another string of variable length (99521199)"

My attempt, which does not work:

import re
import selenium

# [Code omitted here, I use selenium to navigate a web page]

result = driver.find_element_by_class_name("class_name")
print(result.text)  # [This correctly prints the whole string "This is a string of variable length (561401)"]

id = re.findall("??????", result.text)  # [Not sure what to do here]
print(id)

Answer 1

This works for your examples:

(?<=\()[0-9]*

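(?<=...) is a lookbehind: it matches something that comes before the group you are looking for, but does not consume it. In this case I use \( because ( is a special character and has to be escaped with \. [0-9] matches any single digit. * means "match any number of the preceding rule", so [0-9]* matches as many digits as possible.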
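As a rough illustration of how that pattern could be plugged into `re.findall`, here is a minimal sketch. It assumes the text is already available as a plain string; the names `ID_PATTERN` and `samples` are made up for this example. It also swaps `*` for `+`, so that a `(` followed by something other than a digit gives no match rather than an empty string.

```python
import re

# Lookbehind: match a run of digits that is immediately preceded by "(".
# "+" (one or more) is used instead of "*" (zero or more) so that a "("
# followed by a non-digit produces no match rather than an empty string.
ID_PATTERN = re.compile(r"(?<=\()[0-9]+")

samples = [
    "This is a string of variable length (561401)",
    "This is another string of variable length (99521199)",
]

for text in samples:
    # findall returns a list of strings, one per match
    print(ID_PATTERN.findall(text))
# prints ['561401'] and then ['99521199']
```

Note that the results are strings; wrap them in `int(...)` if a number is needed.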

Answer 2

Solved it, thanks to Kaz's link, which was very useful:

http://regex101.com/

Answer 3

You can use this simple solution:

>>> originString = "This is a string of variable length (561401)"
>>> str1 = originString.replace("(", " ")
>>> str1
'This is a string of variable length  561401)'
>>> str2 = str1.replace(")", " ")
>>> str2
'This is a string of variable length  561401 '
>>> [int(s) for s in str2.split() if s.isdigit()]
[561401]

First, I replace the parentheses with spaces. Then I search the new string for integers.
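The same two steps could be wrapped in a small helper, sketched below. The function name `extract_ints` is hypothetical and not part of the answer. One caveat worth seeing in action: this approach picks up every standalone number in the text, not only the one in parentheses.

```python
def extract_ints(text):
    """Return every whitespace-separated, all-digit token in text as an int."""
    # Turn both parentheses into spaces so "(561401)" becomes a standalone token.
    cleaned = text.replace("(", " ").replace(")", " ")
    # Keep only the tokens made up entirely of digits.
    return [int(token) for token in cleaned.split() if token.isdigit()]


print(extract_ints("This is a string of variable length (561401)"))  # [561401]
print(extract_ints("Room 12 (561401)"))  # [12, 561401]
```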

Answer 4

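You don't really need a regular expression here. If the ID is always at the end and always in parentheses, you can split the string, take the last element, and strip the parentheses by slicing ([1:-1]). Regular expressions are comparatively slow.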

line = "This is another string of variable length (99521199)"
print(line.split()[-1][1:-1])
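The one-liner above trusts that the last word really is a parenthesised ID. If that is not guaranteed, a slightly more defensive variant might look like the sketch below; the helper name `id_from_tail` and the choice to return `None` on failure are assumptions made for illustration.

```python
def id_from_tail(line):
    """Return the digits inside a trailing "(...)" token, or None if absent."""
    words = line.split()
    if not words:
        return None  # empty or whitespace-only input
    last = words[-1]
    inner = last[1:-1]  # the last word without its first and last character
    # Accept the token only if it is wrapped in parentheses and holds digits.
    if last.startswith("(") and last.endswith(")") and inner.isdigit():
        return inner
    return None


print(id_from_tail("This is another string of variable length (99521199)"))  # 99521199
print(id_from_tail("no id at the end"))  # None
```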

If you do want to use a regular expression, I would do it like this:

import re
line = "This is another string of variable length (99521199)"
# Raw string, so that \( and \d reach the regex engine unchanged
id_match = re.match(r'.*\((\d+)\)', line)
if id_match:
    print(id_match.group(1))
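A possible variation, not from the answer itself: since the ID is described as always being at the end, `re.search` with a `$` anchor can express that directly, without the leading `.*`. The names `TRAILING_ID` and `trailing_id` below are illustrative only.

```python
import re

# A parenthesised run of digits, optionally followed by whitespace,
# at the very end of the string. Group 1 captures just the digits.
TRAILING_ID = re.compile(r"\((\d+)\)\s*$")


def trailing_id(line):
    """Return the digits of a trailing "(123)" as a string, or None."""
    match = TRAILING_ID.search(line)
    return match.group(1) if match else None


print(trailing_id("This is another string of variable length (99521199)"))  # 99521199
print(trailing_id("(12) appears first, not last"))  # None
```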

Retrieve part of a string, variable length
