Question

I have an array:

array(['apple', 'hello', 'world'])

From it I created two arrays using NumPy's np.tile and np.repeat. The two arrays are as follows.

First array:

term1 = ['apple', 'apple', 'apple', 'hello', 'hello', 'hello','world','world', 'world']

Second array:

term2 = ['apple', 'hello', 'world', 'apple', 'hello', 'world', 'apple', 'hello', 'world']
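
For reference, a minimal sketch of how arrays like these can be produced. The variable name `words` is an assumption, and the results here are NumPy arrays rather than the plain lists shown above.

```python
import numpy as np

words = np.array(['apple', 'hello', 'world'])  # `words` is an assumed name

# np.repeat repeats each element in turn: a a a, b b b, c c c
term1 = np.repeat(words, len(words))

# np.tile repeats the whole array: a b c, a b c, a b c
term2 = np.tile(words, len(words))
```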

After that, my code is:

terms = list(zip(term1,term2))
scores = [function1(frozenset(t)) for t in terms]

where function1 is defined as:

from fuzzywuzzy import fuzz
from fuzzywuzzy import process
def function1(terms):
    if len(terms) == 1:
        return 100
    return fuzz.token_set_ratio(*terms)

The code above works, but it takes a long time. Now I am wondering whether I can change the term1 and term2 arrays into:

[[apple   apple   apple],
 [hello   hello   hello],
 [world   world   world]]


[[apple   hello   world],
 [apple   hello   world],
 [apple   hello   world]]

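I want to pick apple and apple, apple and hello, and so on, pairing the elements at the same position in the two arrays, and pass every such pair to function1. Is there a way to do this (for example, by using something like apply() on the two arrays so that it behaves like an element-wise operation)?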
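
One possible direction, sketched under assumptions rather than offered as a definitive answer: function1 receives a frozenset, so the order of each pair is already disregarded, and only one score per unordered pair is needed. The helper `pairwise_scores` and the stand-in scorer `stand_in_ratio` (built on the standard library's difflib, not fuzzywuzzy's algorithm) are hypothetical names introduced for illustration. With fuzzywuzzy installed, `fuzz.token_set_ratio` could presumably be passed as the scorer instead.

```python
from difflib import SequenceMatcher
from itertools import combinations


def stand_in_ratio(a, b):
    """Hypothetical stand-in scorer (0-100); NOT fuzzywuzzy's algorithm."""
    return round(100 * SequenceMatcher(None, a, b).ratio())


def pairwise_scores(words, scorer=stand_in_ratio):
    """Build an n x n score matrix, calling scorer once per unordered pair."""
    n = len(words)
    # Start from 100 everywhere; the diagonal keeps it, mirroring the
    # len(terms) == 1 branch of function1.
    matrix = [[100] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        score = scorer(words[i], words[j])
        matrix[i][j] = score
        matrix[j][i] = score  # mirror instead of recomputing
    return matrix


words = ['apple', 'hello', 'world']
matrix = pairwise_scores(words)

# Flattening row by row gives the same order as zip(term1, term2).
scores = [value for row in matrix for value in row]
```

For n strings this calls the scorer n(n-1)/2 times instead of n² times, which roughly halves the work but does not change the quadratic growth.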

How can I efficiently compute a distance matrix between strings?

0 answers: