Checking whether keyword combinations exist in a file

Time: 2017-06-09 11:41:23

Tags: python

I have the following list of lists in Python:

combos = [list(x) for x in itertools.permutations(keywords_list, 2)]

It looks like this:

combos
[['revenue', 'margins'], ['revenue', 'liquidity'], ['revenue', 'ratio'], ['revenue', 'pricing'], ['revenue', 'assets'], ['revenue', 'recent trends']]

Now my goal is to check whether each keyword pair exists in a text file, and to count the number of times each pair occurs.

n_occurence = defaultdict(lambda:0)
with open(file_path) as f:
    for line in f:
        for item in combos:
            if item[1] and item[2] in line:
                n_occurence[item] +=1

I get the following error:

IndexError: list index out of range

How should I handle this?

1 Answer:

Answer 0: (score: 0)

You can do this with the re module:

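import re

data = [['revenue', 'margins'], ['revenue', 'liquidity'], ['revenue', 'ratio'], ['revenue', 'pricing'], ['revenue', 'assets'], ['revenue', 'recent trends']]
with open('a.txt') as f:
    txt = f.read()  # read the whole file into one string
    for d in data:
        # re.escape makes each keyword match literally rather than as a regex pattern
        c1 = re.findall(re.escape(d[0]), txt)
        c2 = re.findall(re.escape(d[1]), txt)
        if c1 and c2:
            # report how often each keyword of the pair occurs in the file;
            # key order in the printed dict depends on the Python version
            print({c1[0]: len(c1), c2[0]: len(c2)})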

Output

{'margins': 1, 'revenue': 2}
{'liquidity': 1, 'revenue': 2}
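
As a side note on the IndexError itself: each pair has only two elements, so the valid indexes are 0 and 1, and item[2] is out of range. Two further problems sit in the same loop. The expression "item[1] and item[2] in line" is parsed as "item[1] and (item[2] in line)", so only one keyword is actually tested against the line, and a list cannot be used as a dictionary key. The following is a minimal sketch of a per-line pair count that avoids all three, not a definitive implementation. The function name count_pair_lines and the sample lines are hypothetical and only stand in for the real file; matching is plain substring matching, as in the question.

```python
from collections import Counter
from itertools import permutations


def count_pair_lines(lines, keywords):
    """Count, for each ordered keyword pair, the lines containing both keywords."""
    # Keep the pairs as tuples (not lists) so they can serve as dictionary keys.
    combos = list(permutations(keywords, 2))
    counts = Counter()
    for line in lines:
        for first, second in combos:
            # Test each keyword separately against the line.
            if first in line and second in line:
                counts[(first, second)] += 1
    return counts


# Hypothetical sample lines standing in for the contents of the real file.
sample = [
    "revenue and margins improved",
    "revenue grew while liquidity fell",
    "margins were flat",
]
result = count_pair_lines(sample, ["revenue", "margins", "liquidity"])
print(result[("revenue", "margins")])    # 1
print(result[("revenue", "liquidity")])  # 1
print(result[("margins", "liquidity")])  # 0
```

To run it on a real file, an open file object can be passed as lines, for example inside "with open(file_path) as f:" as count_pair_lines(f, keywords_list). Because permutations yields both orders of every pair, each co-occurrence is counted under both (a, b) and (b, a); itertools.combinations could be used instead if order does not matter.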