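Suppose I have two strings:
s1 = "hello how are you, this is a test"
s2 = "this is a test, testing testing."
How can I extract the part of s2 that is not in s1? Something like:
diff = function(s1, s2)
print(diff)
which should print:
", testing testing."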
Answer 0 (score: 1)
Maybe this will help:
s1 = "hello how are you ?"
s2 = "hello"
# If one string contains the other, strip the shorter one out of the longer one.
if s2 in s1:
    print(s1.replace(s2, ''))
elif s1 in s2:
    print(s2.replace(s1, ''))
else:
    print('Not a substring')
Update
In that case, use this:
s1 = "hello how are you, this is a test"
s2 = "this is a test, testing testing."
diff = ''
for word in s2.split():
    # Keep only the whitespace-separated words of s2 that do not occur in s1.
    if word not in s1.split():
        diff += word + ' '
print(diff)
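Note that the word-based loop compares whole tokens, so "test," (with the comma) counts as different from "test" and ends up in the result. If you need the character-level remainder instead, one possible approach is the standard library's difflib. The following is only a minimal sketch, not part of the original answer; the helper name extract_additions is made up for illustration. It joins the pieces of s2 that SequenceMatcher reports as inserted or replaced relative to s1.

```python
from difflib import SequenceMatcher


def extract_additions(s1, s2):
    """Return the parts of s2 that difflib does not match to s1.

    Hypothetical helper for illustration; the name is not from any library.
    """
    matcher = SequenceMatcher(None, s1, s2, autojunk=False)
    pieces = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        # "insert" and "replace" opcodes mark text present only in s2.
        if tag in ("insert", "replace"):
            pieces.append(s2[j1:j2])
    return "".join(pieces)


s1 = "hello how are you, this is a test"
s2 = "this is a test, testing testing."
print(extract_additions(s1, s2))  # ", testing testing."
```

SequenceMatcher works heuristically from the longest matching blocks, so on other inputs the leftover text may be split differently from what you would pick by hand.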
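Answer 1 (score: 1)
Your best option is probably the Levenshtein algorithm. With it you can compute either the distance between two sentences (the number of single-character insertions, deletions, or substitutions needed to turn one into the other) or a similarity ratio. The example below uses the third-party Levenshtein module: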
>>> import Levenshtein
>>> Levenshtein.distance( 'hello, guys', 'hello, girls' )
3
>>> Levenshtein.ratio( 'hello, guys', 'hello, girls' )
0.782608695652174
You can find implementation details and further information here: https://en.wikipedia.org/wiki/Levenshtein_distance
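If you would rather not install an extra package, the distance itself is short to write by hand. The following is a minimal sketch of the classic dynamic-programming formulation with unit costs; the function name levenshtein_distance is chosen here for illustration and is not the library's API. It computes only the distance, not the similarity ratio.

```python
def levenshtein_distance(a, b):
    """Edit distance with unit-cost insertions, deletions, and substitutions."""
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


print(levenshtein_distance('hello, guys', 'hello, girls'))  # 3
```

Keeping only two rows at a time holds memory at O(len(b)), while the running time stays O(len(a) * len(b)).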