Question

I am trying to find company locations on websites. I have this function:

x=['174 WEST 4TH ST, NYC','All contents © Copyright 2018 Propela']

import re

def is_location(text):
    """Does text contain digits, lowercase and uppercase letters"""
    return all(re.search(pattern, text) for pattern in [r'\d{3,16}', r'[a-z]*', r'[A-Z]'])
# x[1]
# is_location(x[2])

print(list(filter(is_location, x)))

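I want to use a regular expression that captures a string only when numbers are mentioned twice. For example, 174 WEST 4TH ST, NYC contains one group of digits, 174, and then another separate digit, 4.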

Is this possible?

Answer 1

You can match two numbers that appear in separate words of a string with the following pattern:

\d+.*\s+.*\d+

Here is some sample code:

import re

line = "174 WEST 4TH ST, NYC"

# Digits, then whitespace somewhere later, then more digits
res = re.search(r'\d+.*\s+.*\d+', line)
if res:
    print("found a match:", res.group())
else:
    print("no match")
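
To connect this back to the question, here is a minimal sketch of how the pattern could be plugged into the original `filter` call. The helper names `has_two_numbers` and `count_numbers` are made up for illustration and are not part of the original code. The second helper is an alternative I am assuming may also fit: it simply counts runs of digits, so it would also accept something like `12-34`, where the numbers are not separated by whitespace.

```python
import re

# Pattern from the answer: digits, then whitespace somewhere later, then more digits.
TWO_NUMBERS = re.compile(r'\d+.*\s+.*\d+')


def has_two_numbers(text):
    """Return True if text has digits in two whitespace-separated words."""
    return TWO_NUMBERS.search(text) is not None


def count_numbers(text):
    """Alternative check (an assumption): count the separate runs of digits."""
    return len(re.findall(r'\d+', text))


x = ['174 WEST 4TH ST, NYC', 'All contents © Copyright 2018 Propela']

# Keep only the strings that mention a number at least twice.
print(list(filter(has_two_numbers, x)))
print([s for s in x if count_numbers(s) >= 2])
```

Both calls keep only the address string for this sample list. Which check is preferable depends on whether numbers joined by punctuation should count as separate.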
