Python - Trying to classify articles based on a list of keywords and associated IDs

Time: 2016-01-05 08:50:07

Tags: python python-3.x

I am trying to classify articles based on a db table with two columns, like this:

id   keywords
1    cat, kitten, tiger
2    dog, puppy, jackal

Given an article, how can I determine which keywords appear in it, and which ID I should use to classify the article? This is the code I have so far:

cur.execute("SELECT keywords, id FROM Keywords")
keywords = cur.fetchall()
keywords = [k[0] for k in keywords]
if any(word in article for word in keywords):
    matched = [word for word in keywords if word in article]
    print("Matched keywords: %s" % ', '.join(matched))

1 Answer:

Answer 0 (score: 1)

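Each value in the keywords column is a single comma-separated string, so you need to split that string into individual keywords before testing them against the article. Try something like this: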

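cur.execute("SELECT keywords, id FROM Keywords")
result = cur.fetchall()
keywords = []
for row in result:
    # row[0] is a string such as "cat, kitten, tiger"; split it on commas
    # and strip the space that follows each comma
    keywords += [word.strip() for word in row[0].split(',')]
# Substring test: a keyword counts as matched if it occurs anywhere in the article
matched = [word for word in keywords if word in article]
if matched:
    print("Matched keywords: %s" % ', '.join(matched))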
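
The snippet above reports which keywords matched, but it throws away the id, which the question also asks about. Here is a minimal sketch of one way to keep it, assuming the rows come back as (keywords, id) tuples the way cur.fetchall() would return them for that query. The classify function, the "most matched keywords wins" rule, and the case-insensitive whole-word matching are my own assumptions, not something from the question, so adjust them to taste:

```python
import re


def classify(article, rows):
    """Return (best_id, matches), where matches maps id -> matched keywords.

    rows is a list of (keywords, id) tuples, e.g. the result of
    cur.fetchall() for "SELECT keywords, id FROM Keywords".
    classify is a hypothetical helper, not part of any library.
    """
    # Set of lowercased words in the article, for whole-word lookups
    words = set(re.findall(r"\w+", article.lower()))
    matches = {}
    for keyword_string, category_id in rows:
        # Split "cat, kitten, tiger" and keep the keywords found in the article
        found = [kw.strip() for kw in keyword_string.split(",")
                 if kw.strip().lower() in words]
        if found:
            matches[category_id] = found
    if not matches:
        return None, matches
    # Assumption: the id with the most matched keywords classifies the article
    best_id = max(matches, key=lambda cid: len(matches[cid]))
    return best_id, matches


# Sample rows mirroring the table in the question
rows = [("cat, kitten, tiger", 1), ("dog, puppy, jackal", 2)]
best_id, matches = classify("The kitten chased a tiger past the dog.", rows)
print(best_id, matches)
```

Matching whole words avoids false hits such as "cat" inside "category", which the plain `word in article` substring test would report. The trade-off is that a keyword made of several words would never match with this approach, and on a tie the id that appears first in rows wins.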