Extracting the non-matching parts of strings in Python

Time: 2017-10-22 15:49:04

Tags: python string python-3.x

I have two strings, s and s2:

s = "catwalksonterrace9_ontheweekend_at7am"
s2 = "catwalksonterrace$no_ontheweekend_at.*"

I need to compare the two strings and extract the parts that do not match, i.e. $no = 9 and .* = 7am. I'm new to Python; how would I achieve this?

2 answers:

Answer 0: (score: 1)

Take a look at difflib; it's great and can do what you want :) https://docs.python.org/2/library/difflib.html

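import difflib

s = "catwalksonterrace9_ontheweekend_at7am"
s2 = "catwalksonterrace$no_ontheweekend_at.*"

d = difflib.Differ()
diffs = []
in_diff = False
# Passing plain strings makes difflib treat every character as one "line".
for c in d.compare(s, s2):
    if not in_diff and c.startswith(("+", "-")):
        diffs.append(["", ""])   # start a new [only_in_s2, only_in_s] pair
        in_diff = True
    if c.startswith("+"):        # character present only in s2
        diffs[-1][0] += c[2:]    # drop the two-character "+ " prefix
    elif c.startswith("-"):      # character present only in s
        diffs[-1][1] += c[2:]    # drop the two-character "- " prefix
    else:                        # an unchanged character ends the current run
        in_diff = False
print(diffs)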

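This builds a list of lists. In each sublist, the first value is the text that appears only in s2 (the "+" lines) and the second value is the text that appears only in s (the "-" lines).

The output will be:

[['$no', '9'], ['.*', '7am']]

You can then loop over the result and print it in the requested form: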

for diff in diffs:
    print(diff[0], "=", diff[1])
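
The loop above rebuilds each run of changed characters by hand. As a minimal sketch of a more direct route, assuming only the standard library, difflib.SequenceMatcher.get_opcodes() reports every differing region as index ranges, so the mismatched substrings can be sliced out directly. The helper name mismatched_parts is hypothetical and not part of difflib.

```python
from difflib import SequenceMatcher


def mismatched_parts(a, b):
    """Return (part_of_a, part_of_b) pairs for every region where a and b differ."""
    # autojunk=False switches off the popularity heuristic used for long inputs.
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    pairs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":  # 'replace', 'delete' or 'insert'
            pairs.append((a[i1:i2], b[j1:j2]))
    return pairs


s = "catwalksonterrace9_ontheweekend_at7am"
s2 = "catwalksonterrace$no_ontheweekend_at.*"

# Pass the template first so each pair reads (placeholder, value).
pairs = mismatched_parts(s2, s)
for placeholder, value in pairs:
    print(placeholder, "=", value)
# $no = 9
# .* = 7am
```

For a pure insertion or deletion, one element of the pair is simply an empty string.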

Answer 1: (score: 0)

Use the difflib.unified_diff() function:

import difflib, itertools

s1 = "catwalksonterrace9_ontheweekend_at7am"
s2 = "catwalksonterrace$no_ontheweekend_at.*"

result = []
for i in itertools.islice(difflib.unified_diff(s1, s2, lineterm=''), 2, None):
    if i.startswith('@@'):    # diff control line
        result.append(['',''])
    elif i.startswith('-'):   # line unique to sequence 1
        result[-1][0] += i[1:]
    elif i.startswith('+'):   # line unique to sequence 2
        result[-1][1] += i[1:]

print(result)

Output:

[['9', '$no'], ['7am', '.*']]

Each sublist in the resulting result list holds a pair of "old" and "new" values, i.e. [<old>, <new>].
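
One caveat worth illustrating: unified_diff() keeps 3 lines of context by default (here, 3 characters), so two differences separated by 6 or fewer unchanged characters end up in the same hunk and are concatenated into one pair. The sketch below wraps the same loop in a hypothetical helper, changed_pairs, and uses made-up sample strings to show how passing n=0 keeps nearby differences separate.

```python
import difflib
import itertools


def changed_pairs(old, new, context=3):
    """Collect [old_part, new_part] pairs from a character-level unified diff."""
    result = []
    diff = difflib.unified_diff(old, new, lineterm='', n=context)
    for line in itertools.islice(diff, 2, None):  # skip the '---' / '+++' headers
        if line.startswith('@@'):     # hunk header: start a new pair
            result.append(['', ''])
        elif line.startswith('-'):    # character only in old
            result[-1][0] += line[1:]
        elif line.startswith('+'):    # character only in new
            result[-1][1] += line[1:]
    return result


merged = changed_pairs("ab1cd2ef", "abXcdYef")               # default context of 3
separate = changed_pairs("ab1cd2ef", "abXcdYef", context=0)  # no context
print(merged)    # [['12', 'XY']]
print(separate)  # [['1', 'X'], ['2', 'Y']]
```

With the strings from the question the two differences are 16 characters apart, so the default context already yields two separate pairs.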