I need to find every part of a large text string that matches this specific pattern:
"\t\t" + number (between 1-999) + "\t\t"
and then replace each match with:
"TEXT" + "\t\t" + same number + "\t\t"
So the end result would be:
'TEXT\t\t24\t\tblah blah blahTEXT\t\ttt\t\t\t'... and so on...
The numbers vary between 1 and 999, so it needs some kind of wildcard.
Can anyone tell me how to do this? Thanks!
Answer 0 (score: 0)
You need to use Python's re library, specifically the re.sub function:
import re # re is Python's regex library
SAMPLE_TEXT = "\t\t45\t\tbsadfd\t\t839\t\tds532\t\t0\t\t" # Test text to run the regex on
# Run the regex using re.sub (for substitute)
# re.sub is called here with three arguments: the regex pattern,
# a function that returns the replacement text for each match,
# and the text you're running the regex on.
# The regex looks for substrings of the form:
# Two tabs ("\t\t"), followed by one to three digits 0-9 ("[0-9]{1,3}"),
# followed by two more tabs.
# The lambda function takes in a match object x,
# and returns the full text of that object (x.group(0))
# with "TEXT" prepended.
output = re.sub("\t\t[0-9]{1,3}\t\t",
                lambda x: "TEXT" + x.group(0),
                SAMPLE_TEXT)
print(output)  # Print the resulting string (Python 3 syntax).
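
Note that [0-9]{1,3} also accepts 0 and zero-padded values such as 007, while the question asks for 1-999. Below is a minimal sketch of a stricter variant, assuming leading zeros should be rejected. The helper name prepend_text is hypothetical, and the replacement uses the \g<0> whole-match backreference instead of a lambda:

```python
import re

# Assumption: "1-999" means no zero and no leading zeros, so the first digit
# is 1-9 and up to two more digits may follow.
PATTERN = re.compile(r"\t\t[1-9][0-9]{0,2}\t\t")


def prepend_text(text):
    """Prepend "TEXT" to every tab-delimited number found in text.

    prepend_text is a hypothetical helper name used for illustration.
    """
    # \g<0> in the replacement stands for the entire matched substring.
    return PATTERN.sub(r"TEXT\g<0>", text)


sample = "\t\t45\t\tbsadfd\t\t839\t\tds532\t\t0\t\t"
result = prepend_text(sample)
print(repr(result))  # repr makes the tab characters visible
```

With this pattern the trailing "\t\t0\t\t" in the sample is left untouched, whereas the original [0-9]{1,3} would prefix it as well. Which behaviour is right depends on whether 0 can actually occur in the data.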