Question

Suppose I have two strings:

s1 = "hello, this is a test"

s2 = "this is a test, testing testing."

How can I extract the part of s2 that is not in s1?

diff = function(s1,s2) 
print(diff)

", testing testing."

Answer 1

Maybe this will help:

s1 = "hello how are you ?"

s2 = "hello"
if s2 in s1:
    print(s1.replace(s2, ''))
elif s1 in s2:
    print(s2.replace(s1, ''))
else:
    print('Not a substring')

Update

In that case, use this:

s1 = "hello how are you, this is a test"

s2 = "this is a test, testing testing."

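# Split s1 once, instead of on every loop iteration
s1_words = set(s1.split())
# Keep the words of s2 that do not appear in s1, in their original order
diff = ' '.join(word for word in s2.split() if word not in s1_words)
print(diff)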
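Note that the word-based comparison above treats "test," and "test" as different words, so it does not reproduce the exact output asked for in the question. As a rough alternative sketch, the standard library's difflib.SequenceMatcher can compare the strings character by character. The helper name added_text is hypothetical, and the sample strings are the ones from the question.

```python
from difflib import SequenceMatcher


def added_text(s1: str, s2: str) -> str:
    """Return the characters of s2 that SequenceMatcher does not align with s1."""
    # autojunk=False disables the popularity heuristic used for long sequences
    matcher = SequenceMatcher(None, s1, s2, autojunk=False)
    parts = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # "insert" and "replace" mark spans of s2 with no counterpart in s1
        if tag in ("insert", "replace"):
            parts.append(s2[j1:j2])
    return "".join(parts)


s1 = "hello, this is a test"
s2 = "this is a test, testing testing."
print(repr(added_text(s1, s2)))  # ', testing testing.'
```

SequenceMatcher aligns on the longest common blocks, so for strings that share only scattered characters the result may be split into small fragments.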

Answer 2

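The Levenshtein algorithm is the best fit here. If you like, you can calculate the distance between two sentences (the number of single-character insertions, deletions, or substitutions needed to turn one into the other) or their similarity ratio: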

>>> import Levenshtein
>>> Levenshtein.distance( 'hello, guys', 'hello, girls' )
3
>>> Levenshtein.ratio( 'hello, guys', 'hello, girls' )
0.782608695652174

You can find implementation details and further information here: https://en.wikipedia.org/wiki/Levenshtein_distance
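The Levenshtein module used above is a third-party package. If installing it is not an option, the distance can be sketched in pure Python with the usual dynamic-programming recurrence. This is a minimal illustration, not the package's implementation, and the function name levenshtein_distance is an assumption.

```python
def levenshtein_distance(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or
    substitutions needed to turn a into b."""
    # previous[j] = distance between the processed prefix of a and b[:j]
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute ca with cb (or match)
            ))
        previous = current
    return previous[-1]


print(levenshtein_distance("hello, guys", "hello, girls"))  # 3
```

Keeping only two rows of the table holds memory at O(len(b)) while the running time stays O(len(a) * len(b)).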

Extracting the difference between two strings