Question

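I am building a program in Python 3 that needs to iterate over two lists and count how many times the elements of the first list appear in the second. However, even when I plug in two hard-coded lists that share an element, Python says the lists have no elements in common.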

Here is a minimal, runnable version of my program:

strings = ["I sell","seashells","by the","seashore"]
ngramSet = ["by the"]
for ngram in ngramSet:
    print("Ngram: \"" + str(ngram) + "\"")
    # Should return "by the" twice where it appears twice.
    occurrences = [element for element in strings if element is ngram]
    print("Occurrences: " + str(occurrences))
    count = len(occurrences)
    print("Number of times N-gram appears in string" + str(count))

Output:

Ngram: "by the"
Occurrences: []
Number of times N-gram appears in string0

Answer 1

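Your approach is correct. The only problem is in the list comprehension, where you use the is operator to compare two strings. The is operator checks whether both names refer to the same object, not whether the two strings have the same contents. Since you want an equality comparison, compare them with == instead.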
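As a minimal sketch of that fix, here is the comprehension from the question with == in place of is. The second "by the" entry and the counts dictionary are assumptions added for illustration so that the result is greater than one; neither appears in the original code.

```python
# Sample data; the duplicated "by the" is added for illustration only.
strings = ["I sell", "seashells", "by the", "seashore", "by the"]
ngram_set = ["by the"]

counts = {}
for ngram in ngram_set:
    # == compares string contents; `is` would compare object identity.
    occurrences = [element for element in strings if element == ngram]
    counts[ngram] = len(occurrences)
    print(f'Ngram: "{ngram}" appears {counts[ngram]} time(s)')
```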

Answer 2

collections.Counter is designed for exactly this!

import collections

strings = ["I sell","seashells","by the","seashore"]
ngramSet = ["by the"]
strings_counter = collections.Counter(strings)

for string in ngramSet:
    print(string, strings_counter[string])
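
One property that may be useful here: indexing a Counter with a key it has never seen gives 0 instead of raising KeyError, so n-grams that do not occur need no special handling. A small sketch, where "not here" is a hypothetical n-gram added only to show the missing-key case:

```python
import collections

strings = ["I sell", "seashells", "by the", "seashore"]
# "not here" is a hypothetical n-gram that is absent from strings.
ngram_set = ["by the", "not here"]

strings_counter = collections.Counter(strings)
# Indexing a Counter with a missing key yields 0 rather than KeyError.
results = {ngram: strings_counter[ngram] for ngram in ngram_set}
print(results)
```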

Answer 3

You can keep it short and simple. There are several ways to do it, but here is one:

strings = ["I sell","seashells","by the","seashore"]
ngramSet = ["by the"]

# Built-in list.count method
for x in ngramSet:
    print(x, " -> ", strings.count(x))

# Or make it a one-liner
print([(arg, strings.count(arg)) for arg in ngramSet])

Or you can just use your code as it is, since it seems to work for me.
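
A likely reason the original code works on some setups but not others: whether two equal strings are also the same object is an implementation detail (CPython sometimes reuses identical string constants), so the result of is can vary with how the code is run. The sketch below is an illustration under that assumption; the names a, b, same_value and same_object are hypothetical.

```python
a = "by the"
# Built at runtime, so it is usually a separate object from the literal above.
b = " ".join(["by", "the"])

same_value = (a == b)    # compares contents: always True here
same_object = (a is b)   # compares identity: implementation-dependent
print(same_value, same_object)
```

Only the == result is guaranteed, which is why equality is the safer comparison for strings.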

Answer 4

If you only want the common elements, try sets:

list(set(strings).intersection(set(ngramSet)))
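
A self-contained sketch of the same idea follows. Note that sets discard duplicates, so this tells you which elements are shared but not how many times each one occurs. The extra "by the" and the "not here" entry are assumptions added for illustration.

```python
# Sample data; the duplicate and the absent n-gram are illustrative only.
strings = ["I sell", "seashells", "by the", "seashore", "by the"]
ngram_set = ["by the", "not here"]

# intersection() accepts any iterable, so wrapping ngram_set in set() is optional.
common = sorted(set(strings).intersection(ngram_set))
print(common)
```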

Python cannot find common elements in lists

4 answers: