String matching using an algorithm - python

Time: 2019-07-10 05:59:56

Tags: python

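The One and Only Ivan (Korean Edition)

The One and Only Ivan CD

The One and Only Ivan

The One And Only Ivan (Turtleback School & Library Binding Edition)

The One and Only Ivan

The One and Only Ivan: A Harper Classic

The One and Only Ivan Full-Color Collector's Edition

The One And Only Ivan

The One and Only Ivan

One and Only Ivan

Paris Match (A Stone Barrington Novel)

The One and Only Ivan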

One of the strings here is the odd one out. How can I write a Python script that picks out the odd one?

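1 answer:

Answer 0: (score: 0)

Try difflib.SequenceMatcher: compare each title against the others and keep the ones whose similarity ratio stays below a threshold for all of them.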

from difflib import SequenceMatcher
titles = ['The One and Only Ivan (Korean Edition)', 'The One and Only Ivan CD', 'The One and Only Ivan', 'The One And Only Ivan (Turtleback School & Library Binding Edition)', 'The One and Only Ivan', 'The One and Only Ivan: A Harper Classic', "The One and Only Ivan Full-Color Collector's Edition", 'The One And Only Ivan', 'The One and Only Ivan', 'One and Only Ivan', 'Paris Match (A Stone Barrington Novel)', 'The One and Only Ivan']
# Keep a title only if it is dissimilar to every *other* title. A title compared with itself has ratio 1.0, so that pair is skipped; 0.5 is a hand-picked threshold.
print([a for i, a in enumerate(titles)
       if all(SequenceMatcher(None, a, b).ratio() < 0.5 for j, b in enumerate(titles) if j != i)])

Output:

['Paris Match (A Stone Barrington Novel)']
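
A fixed threshold has to be tuned by hand for each data set. As a possible alternative (not part of the answer above), here is a minimal sketch that ranks the strings by their mean similarity to all the others and returns the lowest-scoring one. It relies on SequenceMatcher.ratio(), which returns 2*M/T, where M is the number of matched characters and T is the combined length of both strings. The names least_similar and titles are made up for this illustration, and the sketch assumes the list contains exactly one odd string and at least two items.

```python
from difflib import SequenceMatcher


def least_similar(items):
    """Return (item, mean_ratio) for the item least similar to the rest.

    Hypothetical helper: assumes len(items) >= 2 and a single outlier.
    """
    def mean_ratio(i):
        # Compare items[i] with every other position, skipping itself.
        others = [b for j, b in enumerate(items) if j != i]
        total = sum(SequenceMatcher(None, items[i], b).ratio() for b in others)
        return total / len(others)

    scores = [mean_ratio(i) for i in range(len(items))]
    # Index of the lowest mean similarity.
    idx = min(range(len(items)), key=scores.__getitem__)
    return items[idx], scores[idx]


titles = [
    'The One and Only Ivan (Korean Edition)',
    'The One and Only Ivan CD',
    'The One and Only Ivan',
    'The One And Only Ivan (Turtleback School & Library Binding Edition)',
    'The One and Only Ivan',
    'The One and Only Ivan: A Harper Classic',
    "The One and Only Ivan Full-Color Collector's Edition",
    'The One And Only Ivan',
    'The One and Only Ivan',
    'One and Only Ivan',
    'Paris Match (A Stone Barrington Novel)',
    'The One and Only Ivan',
]

odd, score = least_similar(titles)
print(odd, round(score, 3))
```

This does a pairwise comparison of every string with every other string, so it is quadratic in the list length. That is fine for a dozen titles but may be slow for very large lists.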