Looping through fuzzywuzzy to get a list of lists

Asked: 2019-07-26 17:41:46

Tags: python-3.x string list tuples

Background

I am using the fuzzywuzzy package and have the following sample lists:

from fuzzywuzzy import fuzz 
from fuzzywuzzy import process

token_name_list = [['John', 'D', 'Doe'], ['Jane', 'L' , 'More']]
token_text_list = [['Today', 'we', 'found', 'John', 'Doe', 'here', 'and', 'Jon', 'Does', 'car'], 
                ['We', 'also', 'found', 'Johns','sister', 'Jan', 'who', 'is', 'known', 'Jane', 'L', 'More' ]]

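Goal

I want to use the process.extract function from fuzzywuzzy, which compares a query string against a list of candidate strings and returns the best matches with their scores, e.g. ('John', 100), while looping over the two lists above. Done without a loop, it looks like this: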

#'John' from token_name_list is compared to the 1st list in token_text_list
extract1 = process.extract(token_name_list[0][0],token_text_list[0], limit = 3, scorer = fuzz.ratio)

        [('John', 100), ('Jon', 86), ('found', 44)]

#'D' from token_name_list is compared to the 1st list in token_text_list
extract2 = process.extract(token_name_list[0][1],token_text_list[0], limit = 3, scorer = fuzz.ratio)

        [('Doe', 50), ('and', 50), ('Does', 40)]

#'Doe' from token_name_list is compared to the 1st list in token_text_list
extract3 = process.extract(token_name_list[0][2],token_text_list[0], limit = 3, scorer = fuzz.ratio)

         [('Doe', 100), ('Does', 86), ('we', 40)]

#'Jane' from token_name_list is compared to the 2nd list in token_text_list
extract4 = process.extract(token_name_list[1][0],token_text_list[1], limit = 3, scorer = fuzz.ratio)

         [('Jane', 100), ('Jan', 86), ('Johns', 44)]

#'L' from token_name_list is compared to the 2nd list in token_text_list
extract5 = process.extract(token_name_list[1][1],token_text_list[1], limit = 3, scorer = fuzz.ratio)

         [('L', 100), ('also', 40), ('We', 0)]

#'More' from token_name_list is compared to the 2nd list in token_text_list     
extract6 = process.extract(token_name_list[1][2],token_text_list[1], limit = 3, scorer = fuzz.ratio)

         [('More', 100), ('We', 33), ('who', 29)]

Attempt

I tried the following, but it does not give me what I want:

extract_list = []

for token_name in token_name_list:
    for name, text in zip(token_name, token_text_list):
            extract = process.extract(name,text, limit = 3, scorer = fuzz.ratio)
            extract_list.append(extract)

extract_list

[[('John', 100), ('Jon', 86), ('found', 44)],
 [('found', 33), ('We', 0), ('also', 0)],
 [('and', 57), ('Jon', 57), ('John', 50)],
 [('L', 100), ('also', 40), ('We', 0)]]
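
A likely reason for the four results above: `zip(token_name, token_text_list)` pairs each individual name token with a whole text list and stops at the shorter of the two inputs, so only two names per sublist are ever compared, and the second one is compared against the wrong text. A minimal sketch using only the built-in `zip` and the question's data (the `pairs` list is introduced here purely for illustration) shows which comparisons the loop actually makes:

```python
token_name_list = [['John', 'D', 'Doe'], ['Jane', 'L', 'More']]
token_text_list = [
    ['Today', 'we', 'found', 'John', 'Doe', 'here', 'and', 'Jon', 'Does', 'car'],
    ['We', 'also', 'found', 'Johns', 'sister', 'Jan', 'who', 'is', 'known', 'Jane', 'L', 'More'],
]

# Same loop shape as the attempt, but instead of calling process.extract
# we record (name, index of the text list it would be compared against).
pairs = []
for token_name in token_name_list:
    for name, text in zip(token_name, token_text_list):
        pairs.append((name, token_text_list.index(text)))

print(pairs)
# [('John', 0), ('D', 1), ('Jane', 0), ('L', 1)]
```

'Doe' and 'More' never appear because `zip` is exhausted after two items, and 'D' and 'Jane' are matched against the other person's text.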

Desired output

1) A list of lists

          extract_list=[ [ [('John', 100), ('Jon', 86), ('found', 44)], 
                           [('Doe', 50), ('and', 50), ('Does', 40)], 
                           [('Doe', 100), ('Does', 86), ('we', 40)]  ],
                         [ [('Jane', 100), ('Jan', 86), ('Johns', 44)],
                           [('L', 100), ('also', 40), ('We', 0)],
                           [('More', 100), ('We', 33), ('who', 29)]  ] ]

Question

How can I achieve the desired output?
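
One possible approach, offered as a sketch rather than a confirmed answer: zip the two outer lists so that each name sublist is paired with its own text list, then loop over the names inside. The helper name `extract_nested` and the stand-in `exact_extract` below are hypothetical and not part of fuzzywuzzy; the stand-in only exists so the sketch runs without the package installed, and it scores exact matches only.

```python
def extract_nested(name_lists, text_lists, extract):
    """Pair each names sublist with its own text list, then call
    extract(name, texts) once per name, keeping the nesting."""
    return [
        [extract(name, texts) for name in names]
        for names, texts in zip(name_lists, text_lists)
    ]


def exact_extract(query, choices, limit=3):
    """Hypothetical stand-in for process.extract: 100 for an exact
    match, 0 otherwise, best scores first (ties keep input order)."""
    scored = [(choice, 100 if choice == query else 0) for choice in choices]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:limit]


token_name_list = [['John', 'D', 'Doe'], ['Jane', 'L', 'More']]
token_text_list = [
    ['Today', 'we', 'found', 'John', 'Doe', 'here', 'and', 'Jon', 'Does', 'car'],
    ['We', 'also', 'found', 'Johns', 'sister', 'Jan', 'who', 'is', 'known', 'Jane', 'L', 'More'],
]

extract_list = extract_nested(token_name_list, token_text_list, exact_extract)

# With fuzzywuzzy installed, the question's own call could be plugged in:
#   from functools import partial
#   from fuzzywuzzy import fuzz, process
#   extract = partial(process.extract, limit=3, scorer=fuzz.ratio)
#   extract_list = extract_nested(token_name_list, token_text_list, extract)
```

The result has one inner list per name sublist and one list of (match, score) tuples per name, which is the shape shown under the desired output; the scores themselves depend on the scorer passed in.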

0 answers:

No answers yet