Mining keywords from a given text in Python

Time: 2017-07-23 05:26:16

Tags: python-3.x

List of keywords to mine:

keywords_list = ['object-oriented programming','information security industry','python','java programmer']

This is the text from which the above keywords must be mined:

text = "***Python*** allows programmers to define their own types using classes and also ***python*** are most often used for ***object-oriented programming*** .***Python*** has also seen extensive use in the ***information security industry***, including in exploit development."

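From the text above we have to mine the given keywords, each of which may or may not appear in the text. We also need to count how many times each keyword that is present occurs in the text.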

Expected output:

mined_words = ['python','object-oriented programming','information security industry']

count = [3,1,1]

1 answer:

Answer 0 (score: 0)

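# Count case-insensitive occurrences of each keyword (keywords are already lowercase)
count_list = [text.lower().count(kw) for kw in keywords_list]
# Map keyword -> count, dropping keywords that never occur in the text
mined_text = {kw: n for kw, n in zip(keywords_list, count_list) if n > 0}
# Keywords ordered by count, most frequent first
output = sorted(mined_text, key=mined_text.get, reverse=True)
count_output = [mined_text[out] for out in output]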

In the future, it would be better to post your own progress and ask for advice on it. For now, I hope this works for you.
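
One caveat with `str.count` is that it matches substrings, so `'python'` would also be counted inside a word such as `pythonic`. If whole-word matching is what is wanted (an assumption, the question does not say), a minimal sketch with the standard `re` module could look like this. The function name `mine_keywords` is hypothetical and is not part of the original answer.

```python
import re

keywords_list = ['object-oriented programming', 'information security industry',
                 'python', 'java programmer']

text = ("***Python*** allows programmers to define their own types using classes "
        "and also ***python*** are most often used for "
        "***object-oriented programming*** .***Python*** has also seen extensive "
        "use in the ***information security industry***, including in exploit "
        "development.")


def mine_keywords(text, keywords):
    """Return (mined_words, counts) for the keywords found in text,
    most frequent first. Matching is case-insensitive and whole-word."""
    counts = {}
    for kw in keywords:
        # Lookarounds reject matches glued to other word characters,
        # so "python" is not counted inside "pythonic".
        pattern = r"(?<!\w)" + re.escape(kw) + r"(?!\w)"
        n = len(re.findall(pattern, text, flags=re.IGNORECASE))
        if n:  # keep only keywords that actually occur
            counts[kw] = n
    mined_words = sorted(counts, key=counts.get, reverse=True)
    return mined_words, [counts[kw] for kw in mined_words]


mined_words, count = mine_keywords(text, keywords_list)
print(mined_words)  # ['python', 'object-oriented programming', 'information security industry']
print(count)        # [3, 1, 1]
```

Because `sorted` is stable, keywords with equal counts keep the order they have in `keywords_list`.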