Question

If a given word is surrounded by numbers, I need to separate them. For example, the word is "x".

import re

s = '''
1x 3    # OK
s1x2    # WRONG
2x      # OK
s1 x2   # WRONG
x2      # OK
1sx3    # WRONG
'''

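print(re.sub(r"(?<=\d)\s*x\s*(?=\d)", " x ", s))

This splits every occurrence, even when the surrounding token is not purely numeric. I mean that s1 x2 or s1x3x should not match.

On the other hand, it does not work for "no"; it only applies to the last two lines: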

s = '''
2 no 3  # OK (but it's not needed to match)
2no     # OK
3no2    # OK
no9     # OK
xno9    # WRONG
5 non   # WRONG (for 'no')
'''

print(re.sub(r"(?<=\d)\s*no\s*(?=\d)", " x ", s))

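I have edited some of the examples. It also needs to work within sentences, for example:

Sever land and erect 1x 3 Bedroom chalet bungalow and 1 x 2 Bedroom bungalow. Installation of 2 non-illuminated fascia signs and 2no ad signs.

Both occurrences in the first sentence should match, and in the second sentence only the second one should match.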

Edit

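Thanks to the posts below, I found this pattern, which matches:

\b(?:\d*\s*x\s*\d+|\d+\s*x\s*\d*)\b

The problem is that it cannot be used for substitution. The idea is to add extra spaces around a word that is surrounded by numbers. Although this pattern now selects those phrases correctly (both on single lines and within sentences), it does not work with substitution, because the match should cover only the word itself: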

s = "Sever land and erect 1x 3 Bedroom chalet bungalow and 1x2 Bedroom bungalow"

re.sub(r"\b(?:\d*\s*x\s*\d+|\d+\s*x\s*\d*)\b", " x ", s, flags=re.IGNORECASE)

Answer 1

You can use an alternation to match the required digits on either the left or the right side, with x or no matched in the middle:

^(?:\d* *(?:x|no)\s*\d+|\d+\s*(?:x|no) *\d*)$
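The pattern in this answer is anchored with ^ and $, so it validates whole lines rather than rewriting phrases inside a sentence. One possible adaptation for substitution, sketched here as an assumption and not as part of the original answer, is to swap the anchors for word boundaries, capture the digits and the word, and rebuild the match with single spaces. The names WORD, PATTERN and separate are hypothetical.

```python
import re

# Hypothetical names; the word list mirrors the answer's (?:x|no).
WORD = r"(?:x|no)"

# Either digits on the left (digits on the right optional),
# or no digits on the left and digits required on the right.
# \b keeps the match from starting or ending inside a longer token.
PATTERN = re.compile(
    r"\b(?:(\d+)\s*({w})(?:\s*(\d+))?|({w})\s*(\d+))\b".format(w=WORD),
    re.IGNORECASE,
)

def separate(text):
    """Put single spaces between the word and its neighbouring numbers."""
    # Groups that did not take part in the match are None and are skipped.
    return PATTERN.sub(lambda m: " ".join(g for g in m.groups() if g), text)

s = "Sever land and erect 1x 3 Bedroom chalet bungalow and 1x2 Bedroom bungalow"
print(separate(s))
```

Because the number must start at a word boundary, tokens such as s1x2, 1sx3 and xno9 are left untouched, and "5 non" is not split. One limitation: in "s1 x2" the trailing "x2" is treated like a standalone "x2" and becomes "x 2", which the question lists as unwanted, so a stricter rule would be needed for that case.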

Answer 2

data = '''
Sever land and erect 1x 3 Bedroom chalet bungalow and 1x2 bedroom bungalow. Installation of 2 non-illuminated fascia signs and 2no ad signs.
'''

cases = ['no', 'nos', 'x']

import re

l = data
for case in cases:
    l = re.sub(r'\s{2,}', ' ', re.sub(r'(?<=\d| ){}(?=\d| )'.format(case), r' {} '.format(case), l))

print(l)

Prints:

Sever land and erect 1 x 3 Bedroom chalet bungalow and 1 x 2 bedroom bungalow. Installation of 2 non-illuminated fascia signs and 2 no ad signs.
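The loop above runs one substitution per word. A single-pass variant is sketched below, assuming the same cases list and the same "digit or space on both sides" rule; the helper name split_words is hypothetical, and re.escape is added only as a precaution in case a word ever contains regex metacharacters.

```python
import re

cases = ['no', 'nos', 'x']

# Longest first, so a longer word is tried before its own prefix.
alternatives = '|'.join(
    re.escape(c) for c in sorted(cases, key=len, reverse=True)
)

# Same rule as the loop: a digit or a space on each side of the word.
pattern = re.compile(r'(?<=[\d ])(?:{})(?=[\d ])'.format(alternatives))

def split_words(text):
    """Surround each matched word with spaces, then collapse repeated spaces."""
    spaced = pattern.sub(lambda m: ' {} '.format(m.group(0)), text)
    # Only runs of spaces are collapsed, so line breaks are preserved.
    return re.sub(r' {2,}', ' ', spaced)

data = ('Sever land and erect 1x 3 Bedroom chalet bungalow and 1x2 bedroom '
        'bungalow. Installation of 2 non-illuminated fascia signs and 2no ad signs.')
print(split_words(data))
```

Like the loop, this sketch only looks at the characters immediately next to the word, so it would still split a token such as s1x2.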

How can I separate numbers from a given word using a regular expression?
