Question

I need to find every part of a large text string that matches this specific pattern:

"\t\t" + number (between 1-999) + "\t\t"

and then replace each match with:

TEXT+"\t\t"+same number+"\t\t"

so that the final result is:

'TEXT\t\t24\t\tblah blah blahTEXT\t\ttt\t\t\t'... and so on...

The numbers vary between 1 and 999, so the pattern needs some kind of wildcard.

Can anyone show me how to do this? Thanks!

Answer 1

You need Python's re library, specifically the re.sub function:

import re  # re is Python's regex library
SAMPLE_TEXT = "\t\t45\t\tbsadfd\t\t839\t\tds532\t\t0\t\t"  # Test text to run the regex on

# Run the regex using re.sub (for substitute)
# re.sub takes three arguments: the regex expression,
# a function to return the substituted text,
# and the text you're running the regex on.

# The regex looks for substrings of the form:
# Two tabs ("\t\t"), followed by one to three digits 0-9 ("[0-9]{1,3}"),
# followed by two more tabs.

# The lambda function takes in a match object x,
# and returns the full text of that object (x.group(0))
# with "TEXT" prepended.
output = re.sub("\t\t[0-9]{1,3}\t\t",
                lambda x: "TEXT" + x.group(0),
                SAMPLE_TEXT)

print(output)  # Print the resulting string.
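
Note that [0-9]{1,3} also accepts 0 and zero-padded values such as 007, which fall outside the 1-999 range in the question. If that matters, a stricter variant might look like the sketch below. It assumes Python 3; the names PATTERN and prepend_text are made up for illustration, and it uses the \g<0> backreference (the whole match) in a replacement string instead of a lambda.

```python
import re

SAMPLE_TEXT = "\t\t45\t\tbsadfd\t\t839\t\tds532\t\t0\t\t"

# [1-9][0-9]{0,2} matches 1 through 999 with no leading zero.
# In a raw string, the regex engine itself interprets \t as a tab.
PATTERN = re.compile(r"\t\t[1-9][0-9]{0,2}\t\t")


def prepend_text(text, prefix="TEXT"):
    """Insert `prefix` before every tab-delimited number in 1-999.

    Hypothetical helper. `prefix` is used as part of a replacement
    template, so it is assumed to contain no backslashes.
    """
    # \g<0> stands for the entire matched substring.
    return PATTERN.sub(prefix + r"\g<0>", text)


result = prepend_text(SAMPLE_TEXT)
print(repr(result))
```

With this pattern the trailing "\t\t0\t\t" in the sample is left unchanged, whereas the original pattern would prefix it with TEXT as well.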
