What is the best way to get the difference between two multi-line strings?
a = 'testing this is working \n testing this is working 1 \n'
b = 'testing this is working \n testing this is working 1 \n testing this is working 2'
import difflib
diff = difflib.ndiff(a, b)
print(''.join(diff))
This produces:
t e s t i n g t h i s i s w o r k i n g
t e s t i n g t h i s i s w o r k i n g 1
+ + t+ e+ s+ t+ i+ n+ g+ + t+ h+ i+ s+ + i+ s+ + w+ o+ r+ k+ i+ n+ g+ + 2
What is the best way to get exactly this:
testing this is working 2
Would a regular expression be the solution?
Answer 0 (score: 3)
The simplest hack, credit to @Chris, is to use split().
Note: you need to determine which string is longer and call split on that one.
if len(a) > len(b):
    res = ''.join(a.split(b))  # get diff
else:
    res = ''.join(b.split(a))  # get diff
print(res.strip())  # remove whitespace on either side
# driver values
IN : a = 'testing this is working \n testing this is working 1 \n'
IN : b = 'testing this is working \n testing this is working 1 \n testing this is working 2'
OUT : testing this is working 2
EDIT: thanks to @ekhumoro for another hack using replace, which does not need the join call at all.
if len(a) > len(b):
    res = a.replace(b, '')  # get diff
else:
    res = b.replace(a, '')  # get diff
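Both variants rely on the shorter string occurring verbatim inside the longer one. A minimal sketch that wraps the replace version in a function (the name substring_diff and the string c are made up for illustration) and shows what happens when that assumption does not hold:

```python
def substring_diff(a, b):
    """Remove the shorter string from the longer one and return the rest."""
    longer, shorter = (a, b) if len(a) > len(b) else (b, a)
    # str.replace only removes `shorter` where it occurs verbatim
    return longer.replace(shorter, '').strip()

a = 'testing this is working \n testing this is working 1 \n'
b = 'testing this is working \n testing this is working 1 \n testing this is working 2'
print(substring_diff(a, b))  # testing this is working 2

# If the shorter string is not contained in the longer one, nothing is removed
c = 'testing this is broken \n'
print(substring_diff(c, b) == b.strip())  # True
```

So this hack is fine when one string simply extends the other, but it is not a general diff.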
Answer 1 (score: 3)
a = 'testing this is working \n testing this is working 1 \n'
b = 'testing this is working \n testing this is working 1 \n testing this is working 2'
splitA = set(a.split("\n"))
splitB = set(b.split("\n"))
diff = splitB.difference(splitA)
diff = ", ".join(diff) # ' testing this is working 2, more things if there were...'
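Essentially, turn each string into a set of lines and take the set difference, i.e. everything in B that is not in A. Then take that result and join it all into one string.
Edit: this is just another way of saying what @ShreyasG said - [x for x if x not in y]...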
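Answer 2 (score: 2)
This is basically @Godron629's answer, but since I can't comment, I'm posting it here with a slight modification: change difference to symmetric_difference so that the order of the two sets doesn't matter.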
a = 'testing this is working \n testing this is working 1 \n'
b = 'testing this is working \n testing this is working 1 \n testing this is working 2'
splitA = set(a.split("\n"))
splitB = set(b.split("\n"))
diff = splitB.symmetric_difference(splitA)
diff = ", ".join(diff) # ' testing this is working 2, some more things...'
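One detail worth checking with the sample strings: a ends in a newline, so a.split("\n") contains a trailing empty string that the lines of b lack, and symmetric_difference reports it too. A small sketch comparing the two set operations; dropping empty lines and sorting the result are my own additions, not part of the answers above:

```python
a = 'testing this is working \n testing this is working 1 \n'
b = 'testing this is working \n testing this is working 1 \n testing this is working 2'

lines_a = set(a.split("\n"))
lines_b = set(b.split("\n"))

# One-directional: lines of b that are missing from a
only_in_b = lines_b.difference(lines_a)
print(only_in_b)  # {' testing this is working 2'}

# Symmetric: lines that appear in exactly one of the two strings
either = lines_b.symmetric_difference(lines_a)
print(sorted(either))  # ['', ' testing this is working 2']

# Dropping empty lines and sorting gives a stable result to join
cleaned = sorted(line for line in either if line.strip())
print(", ".join(cleaned))
```

Sets are unordered, so joining them directly can produce the pieces in any order; sorting (or a list comprehension over the original lines) avoids that.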
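Answer 3 (score: 0)
Building on @Chris_Rands's comment, you can also use splitlines() (if your strings are multi-line and you want the lines that are present in one but not in the other):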
b_s = b.splitlines()
a_s = a.splitlines()
[x for x in b_s if x not in a_s]
The expected output is:
[' testing this is working 2']
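The same line-splitting step also explains the character-by-character output in the question: difflib.ndiff expects sequences of lines, and a plain string is treated as a sequence of single characters. A sketch, assuming only added lines are of interest, that feeds ndiff the split lines and keeps the entries prefixed with '+ ':

```python
import difflib

a = 'testing this is working \n testing this is working 1 \n'
b = 'testing this is working \n testing this is working 1 \n testing this is working 2'

# ndiff compares two lists of lines; each result line starts with a
# two-character code: '  ' (common), '- ' (only in a), '+ ' (only in b)
delta = difflib.ndiff(a.splitlines(), b.splitlines())

# Keep only the lines unique to b and strip the two-character code
added = [line[2:] for line in delta if line.startswith('+ ')]
print(added)  # [' testing this is working 2']
```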
Answer 4 (score: 0)
import itertools as it
"".join(y for x, y in it.zip_longest(a, b) if x != y)
# ' testing this is working 2'
Alternatively:
import collections as ct
ca = ct.Counter(a.split("\n"))
cb = ct.Counter(b.split("\n"))
diff = cb - ca
"".join(diff.keys())
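The Counter variant differs from the set-based answers in that it keeps track of how often each line occurs. A sketch with made-up input (the strings x and y are hypothetical) in which one line is repeated:

```python
import collections as ct

x = "alpha\nbeta"
y = "alpha\nbeta\nbeta\ngamma"

cx = ct.Counter(x.split("\n"))
cy = ct.Counter(y.split("\n"))

# Counter subtraction keeps only positive counts:
# lines that y contains more often than x
extra = cy - cx
print(sorted(extra.elements()))  # ['beta', 'gamma']

# A set difference would not notice the second 'beta'
print(sorted(set(y.split("\n")) - set(x.split("\n"))))  # ['gamma']
```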
Answer 5 (score: 0)
You can use the following function:
def __slave(a, b):
    for i, l_a in enumerate(a):
        if b == l_a:
            return i
    return -1
def diff(a, b):
    t_b = b
    c_i = 0
    for c in a:
        t_i = __slave(t_b, c)
        if t_i != -1 and (t_i > c_i or t_i == c_i):
            c_i = t_i
            t_b = t_b[:c_i] + t_b[c_i+1:]
    t_a = a
    c_i = 0
    for c in b:
        t_i = __slave(t_a, c)
        if t_i != -1 and (t_i > c_i or t_i == c_i):
            c_i = t_i
            t_a = t_a[:c_i] + t_a[c_i+1:]
    return t_b + t_a
Usage example: print(diff(a, b))
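For comparison, a character-level difference can also be obtained from the standard library. This is a different technique from the function above, not a rewrite of it: a sketch using difflib.SequenceMatcher, where the helper name char_diff is made up for illustration:

```python
import difflib

def char_diff(a, b):
    """Return the characters deleted from a plus the characters inserted from b."""
    matcher = difflib.SequenceMatcher(None, a, b)
    pieces = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ('delete', 'replace'):
            pieces.append(a[i1:i2])  # characters only in a
        if tag in ('insert', 'replace'):
            pieces.append(b[j1:j2])  # characters only in b
    return ''.join(pieces)

a = 'testing this is working \n testing this is working 1 \n'
b = 'testing this is working \n testing this is working 1 \n testing this is working 2'
print(repr(char_diff(a, b)))  # ' testing this is working 2'
```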