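Finding a list of keywords in a string

Time: 2013-09-06 14:48:06

Tags: python string list csv

I have a list of keywords, and I'm trying to check whether any of them appear in my CSV sheet; if one does, the row should be flagged. My code works perfectly, except that when a row contains more than one keyword it doesn't get flagged. Any ideas?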

import sys
import csv
nk = ('aaa','bbb','ccc')
with open(sys.argv[1], "rb") as f:
    reader = csv.reader(f, delimiter = '\t')
    for row in reader:
        string=str(row)
        if any(word in string for word in nk):
            row.append('***')
            print '\t'.join(row)
        else:
            print '\t'.join(row)

Thanks in advance!

1 answer:

Answer 0: (score: 0)

Use set intersection to get all the words that a row has in common with the keyword set:

nk = {'aaa','bbb','ccc'}
seen = set()             # keeps track of the keywords seen so far
with open(sys.argv[1], "rb") as f:
    ...
    for row in reader:
        #update `seen` with the items found common between `nk` and the current `row`
        seen.update(nk.intersection(row))
    ...
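
For reference, here is a minimal, self-contained sketch of how the snippet above could be combined with the flagging from the question. It assumes Python 3, and the helper name flag_rows and the in-memory sample data are made up for illustration (they stand in for the file named by sys.argv[1]). Note that intersection only matches keywords that appear as whole cells.

```python
import csv
import io

NK = {'aaa', 'bbb', 'ccc'}  # keyword set from the question


def flag_rows(f, keywords=NK):
    """Hypothetical helper: returns (rows, seen).

    A row gets '***' appended when at least one of its cells
    is exactly equal to a keyword.
    """
    seen = set()   # every keyword found in any row so far
    flagged = []
    for row in csv.reader(f, delimiter='\t'):
        common = keywords.intersection(row)  # keywords present as whole cells
        seen.update(common)
        if common:
            row = row + ['***']
        flagged.append(row)
    return flagged, seen


# Made-up sample data standing in for the real tab-separated file.
sample = io.StringIO("x\taaa\tbbb\nfoo\tbar\nccc\ty\n")
rows, seen = flag_rows(sample)
for row in rows:
    print('\t'.join(row))
```

With this sample, the first row contains two keywords and is still flagged once, because the check only asks whether the intersection is non-empty.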

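Don't convert row to a string (string=str(row)). The in operator also works on lists, but it behaves differently there: on a string it does a substring search, while on a list it tests whether an item is equal to the given value: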

>>> strs = "['foo','abarc']"
>>> 'bar' in strs            #substring search
True
>>> lis = ['foo','abarc']    #item search
>>> 'bar' in lis
False
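
Which behaviour you want depends on the data. As a rough sketch (assuming Python 3; the two function names are made up for illustration), the whole-cell test and the substring-within-a-cell test can be written side by side without ever calling str(row):

```python
NK = ('aaa', 'bbb', 'ccc')  # keywords from the question


def matches_whole_cell(row, keywords=NK):
    # List membership: True only if some cell equals a keyword exactly.
    return any(word in row for word in keywords)


def matches_inside_cell(row, keywords=NK):
    # Substring search inside each individual cell.
    return any(word in cell for cell in row for word in keywords)


row = ['foo', 'xaaay']  # made-up example row
print(matches_whole_cell(row))   # no cell is exactly 'aaa'
print(matches_inside_cell(row))  # 'aaa' occurs inside 'xaaay'
```

The second form keeps the substring semantics of the original code while avoiding accidental matches against the brackets, quotes, and commas that str(row) adds.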