I want to make the following substitutions:
default by <http://www.mycompany.com/>
db: by <http://www.mydbcompany.com/>
I have data in the following format:
<a> <b> <c>.
<d> db:connect <e>.
db:start <f> <g>.
<h> <i> "hello".
I want to convert this data into the following form:
<http://www.mycompany.com/a> <http://www.mycompany.com/b> <http://www.mycompany.com/c>.
<http://www.mycompany.com/d> <http://www.mydbcompany.com/connect> <http://www.mycompany.com/e>.
<http://www.mydbcompany.com/start> <http://www.mycompany.com/f> <http://www.mycompany.com/g>.
<http://www.mycompany.com/h> <http://www.mycompany.com/i> "hello".
So far I have tried to get the desired format by splitting each line like this:
line1=re.split('(?<=)\s+(?=<)',line)
Then, for each of line1[0], line1[1], and line1[2], I tried to
substitute < by <http://www.mycompany.com/
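However, my problem is that this approach does not work for the db: terms or for the quoted strings. Is there a way to produce the desired output in Python?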
Answer 0: (score: 1)
Why not use re.sub?
import re

S = """\
<a> <b> <c>.
<d> db:connect <e>.
db:start <f> <g>.
<h> <i> "hello".
"""
# Expand every <name> term into the default namespace.
expand_tags = re.sub(r"<(.*?)>", r"<http://www.mycompany.com/\1>", S)
# Expand db:name terms. \S+ stops before the whitespace, so the separating space is kept.
expand_db = re.sub(r"db:(\S+)", r"<http://www.mydbcompany.com/\1>", expand_tags)
print(expand_db)
#>>> <http://www.mycompany.com/a> <http://www.mycompany.com/b> <http://www.mycompany.com/c>.
#>>> <http://www.mycompany.com/d> <http://www.mydbcompany.com/connect> <http://www.mycompany.com/e>.
#>>> <http://www.mydbcompany.com/start> <http://www.mycompany.com/f> <http://www.mycompany.com/g>.
#>>> <http://www.mycompany.com/h> <http://www.mycompany.com/i> "hello".
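The \1 in the second argument (the replacement) refers to whatever the parentheses captured in the first argument (the pattern), so you can match a pattern and reuse the matched text in the replacement. That said, this data format seems rather odd, so you may want to reconsider the overall design.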
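If more prefixes turn up later, the same idea can be done in a single pass with a replacement function instead of one re.sub call per prefix. The following is only a sketch under some assumptions: the names PREFIXES, TOKEN, expand and expand_terms are made up for illustration, and terms are assumed to look like <name>, prefix:name or a double-quoted string with no escaped quotes inside.

```python
import re

# Hypothetical prefix table; the empty key stands for the default namespace.
PREFIXES = {
    "": "http://www.mycompany.com/",
    "db": "http://www.mydbcompany.com/",
}

# Alternatives are tried left to right at each position:
# a quoted literal, a <name> term, or a prefix:name term.
TOKEN = re.compile(r'"[^"]*"|<([^<>]+)>|\b(\w+):(\w+)')


def expand(match):
    """Return the expanded form of one matched term."""
    if match.group(1) is not None:
        # <name> goes into the default namespace.
        return "<" + PREFIXES[""] + match.group(1) + ">"
    if match.group(2) in PREFIXES:
        # prefix:name with a known prefix.
        return "<" + PREFIXES[match.group(2)] + match.group(3) + ">"
    # Quoted literal or unknown prefix: leave the text unchanged.
    return match.group(0)


def expand_terms(text):
    """Expand every term in text in one pass."""
    return TOKEN.sub(expand, text)


print(expand_terms('<d> db:connect <e>.'))
print(expand_terms('<h> <i> "hello".'))
```

Because the quoted literal is matched first and returned as is, anything that looks like a term inside the quotes is left alone, which the two chained re.sub calls would not guarantee.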