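I have two strings, s and s2:
s = "catwalksonterrace9_ontheweekend_at7am"
s2 = "catwalksonterrace$no_ontheweekend_at.*"
I need to compare the two strings in Python and extract the parts that do not match, so that I get
$no = 9
and .* = 7am
I am new to Python. How can I achieve this?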
Answer 0 (score: 1)
Take a look at difflib. It is great and can do what you want :) https://docs.python.org/2/library/difflib.html
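import difflib

# s and s2 are the two strings from the question
d = difflib.Differ()
diffs = []
in_diff = False
for c in d.compare(s, s2):
    # a "+" or "-" line that follows matching text starts a new pair
    if not in_diff and (c.startswith("+") or c.startswith("-")):
        diffs.append(["", ""])
        in_diff = True
    if c.startswith("+"):    # character found only in s2
        diffs[-1][0] += c[2:]
    elif c.startswith("-"):  # character found only in s
        diffs[-1][1] += c[2:]
    else:                    # matching character: the current pair is complete
        in_diff = False
print(diffs)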
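This builds a list of lists. In each sublist, the first value is the text found only in s2 (the lines that Differ marks with "+") and the second value is the text found only in s (the lines marked with "-").
The output will be: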
[['$no', '9'], ['.*', '7am']]
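To see why the loop checks for "+" and "-" prefixes, it can help to look at what Differ.compare yields when it is given plain strings: each character is treated as its own "line" and gets a two-character prefix. The sketch below uses short made-up inputs (old and new are hypothetical names, not from the question) so the whole output fits on one line.

```python
import difflib

# Short hypothetical inputs, chosen so the full output is easy to read.
old = "a9b"
new = "a$b"

# Each string is treated as a sequence of one-character "lines".
# Prefix "  " marks a character common to both strings, "- " a character
# only in the first string, and "+ " a character only in the second.
lines = list(difflib.Differ().compare(old, new))
print(lines)  # ['  a', '- 9', '+ $', '  b']
```

This is also why slicing off the first two characters of each entry recovers the original character.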
You can then loop over the result and print it in the requested form:
for diff in diffs:
    print(diff[0], "=", diff[1])
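As a possible alternative to parsing the prefixed lines, difflib.SequenceMatcher exposes the same comparison as index ranges through get_opcodes(). The following is only a sketch of that different technique, not the method used in this answer; the helper name mismatches and its parameter names are made up for illustration.

```python
import difflib

def mismatches(template, actual):
    """Return (template_part, actual_part) pairs where the two strings differ.

    Hypothetical helper: the name and signature are illustrative only.
    """
    matcher = difflib.SequenceMatcher(None, template, actual, autojunk=False)
    pairs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # tag is one of 'equal', 'replace', 'delete', 'insert'
        if tag != "equal":
            pairs.append((template[i1:i2], actual[j1:j2]))
    return pairs

s = "catwalksonterrace9_ontheweekend_at7am"
s2 = "catwalksonterrace$no_ontheweekend_at.*"

# Pass s2 first so each pair reads as (placeholder, value).
for placeholder, value in mismatches(s2, s):
    print(placeholder, "=", value)
```

One caveat applies to both approaches: the matcher works character by character, so if a placeholder and its value happen to share characters, a single mismatch may be split into several smaller pairs.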
Answer 1 (score: 0)
import difflib, itertools

s1 = "catwalksonterrace9_ontheweekend_at7am"
s2 = "catwalksonterrace$no_ontheweekend_at.*"
result = []
# skip the first two lines of the diff (the '---' and '+++' file headers)
for i in itertools.islice(difflib.unified_diff(s1, s2, lineterm=''), 2, None):
    if i.startswith('@@'):   # diff control line
        result.append(['', ''])
    elif i.startswith('-'):  # line unique to sequence 1
        result[-1][0] += i[1:]
    elif i.startswith('+'):  # line unique to sequence 2
        result[-1][1] += i[1:]
print(result)
Output:
[['9', '$no'], ['7am', '.*']]
Each sublist in the resulting result list contains a pair of "old" and "new" values, i.e. [<old>, <new>].
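If the goal is the "$no = 9" style of output from the question, the [<old>, <new>] pairs can be flipped into a lookup keyed by the placeholder. This is a small sketch that starts from the list printed above; the name values is an assumption made for illustration.

```python
# Pairs as printed by the snippet above: [<old>, <new>], where <old> comes
# from s1 and <new> comes from s2.
result = [['9', '$no'], ['7am', '.*']]

# Map each placeholder from s2 to the concrete text found in s1.
values = {new: old for old, new in result}

for placeholder, value in values.items():
    print(placeholder, "=", value)
```

A dict keeps only one value per key, so this form fits only when each placeholder appears once in s2.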