Question

Here are two lists:

list1 = ['apple pie', 'apple cake', 'the apple pie', 'the apple cake', 'apple']

list2 = ['apple', 'lots of apple', 'here is an apple', 'humungous apple', 'carrot cake']

I tried an algorithm called longest substring finder, but, as the name suggests, it does not return what I am looking for.

def longestSubstringFinder(string1, string2):
    answer = "NULL"
    len1, len2 = len(string1), len(string2)
    for i in range(len1):
        match = ""
        for j in range(len2):
            if (i + j < len1 and string1[i + j] == string2[j]):
                match += string2[j]
            else:
                if (len(match) > len(answer)): answer = match
                match = ""
    return answer


mylist = []

def call():
    for i in file_names_short:
        s1 = i
        for j in company_list:
            s2 = j
            s1 = s1.lower()
            s2 = s2.lower()
            while(longestSubstringFinder(s2,s1) != "NULL"):
                x = longestSubstringFinder(s2,s1)
                # print(x)
                mylist.append(x)
                s2 = s2.replace(x, ' ')

call()
print('[%s]' % ','.join(map(str, mylist)))

The expected output should be:

output = ['apple', 'apple', 'apple', 'apple', '']

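The word is not always going to be apple. The real list is larger and contains many different words. What I am always looking for is the word that matches in both lists, and in this example apple is always the longest such word in list1.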

Another example (perhaps clearer):

string1 = ['Walgreens & Co.', 'Amazon Inc']
string2 = ['walgreens customers', 'amazon products', 'other words'] 
output = ['walgreens', 'amazon', '']

Answer 1

Edit: updated to return the longest match.

list1 = ['apple pie cucumber', 'apple cake', 'the apple pie', 'the apple cake', 'apple']
list2 = ['apple cucumber', 'lots of apple', 'here is an apple', 'humungous apple', 'carrot cake']

result = []

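# Compare the two lists pairwise, position by position.
for s1, s2 in zip(list1, list2):
    words1, words2 = s1.split(), s2.split()
    # Keep the words of the first string that also appear in the second.
    match = [w for w in words1 if w in words2]

    # Pick the longest shared word, or '' when nothing matches.
    longest = max(match, key=len) if match else ''
    result.append(longest)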

print(result)

Output:

['cucumber', 'apple', 'apple', 'apple', '']
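
Note that the word comparison above is case-sensitive and splits only on whitespace, so it would likely miss the second example from the question ('Walgreens & Co.' against 'walgreens customers'). Below is a minimal sketch of one way to handle that, assuming it is acceptable to lowercase both strings and treat any run of letters or digits as a word. The helper name longest_common_word is made up for illustration, and zip_longest is used only because the two example lists have different lengths.

```python
import re
from itertools import zip_longest


def longest_common_word(a, b):
    """Return the longest word found in both strings, ignoring case."""
    # \w+ grabs runs of letters/digits, which drops punctuation such as '&' or '.'.
    words_a = re.findall(r"\w+", a.lower())
    words_b = set(re.findall(r"\w+", b.lower()))
    common = [w for w in words_a if w in words_b]
    # An empty string signals that the pair shares no word.
    return max(common, key=len) if common else ''


string1 = ['Walgreens & Co.', 'Amazon Inc']
string2 = ['walgreens customers', 'amazon products', 'other words']

# Pad the shorter list with '' so every entry of the longer list gets a result.
result = [longest_common_word(a, b)
          for a, b in zip_longest(string1, string2, fillvalue='')]
print(result)  # ['walgreens', 'amazon', '']
```

If several shared words have the same length, max returns the first one in the order they appear in the first string. Whether lowercasing the result is fine depends on what you do with it afterwards.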
