I am using the following function to classify sentences into themes:
def theme(x):
    output = []
    category = ()
    for i in x:
        if 'AC' in i:
            category = 'AC problem'
        elif 'insects' in i:
            category = 'Cleanliness'
        elif 'clean' in i:
            category = 'Cleanliness'
        elif 'food' in i:
            category = 'Food Problem'
        elif 'delay' in i:
            category = 'Train Delayed'
        else:
            category = 'None'
        output.append(category)
    return output
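I don't want to repeat an if statement for every word in a category. Instead, I want to supply a list/dict, e.g. Cleanliness = ['Clean', 'Cleaned', 'spoilt', 'dirty'], so that a sentence gets the category 'Cleanliness' if it contains any of the words in the list. How can I do that?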
Answer 0: (score: 1)
You can use a dict of sets to organize the words by category, and then generate a word-to-category lookup dict from that structure:
categories = {
    'Cleanliness': {'insects', 'clean'},
    'AC Problem': {'AC'},
    'Food Problem': {'food'},
    'Train Delayed': {'delay'}
}
lookup = {word: category for category, words in categories.items() for word in words}

def theme(x):
    return {lookup.get(word, 'None') for word in x}
so that theme(['AC', 'clean', 'insects']) returns a set of the corresponding categories:
{'Cleanliness', 'AC Problem'}
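Note that the function above looks up each element of x as a whole word. If x holds full sentences, as in the question, one possible adaptation is to split each sentence into words first. The following is only a minimal sketch under that assumption: the name theme_sentences, the regex tokenization, and the extra words taken from the question's example list are illustrative and not part of the original answer.

```python
import re

# Words per category, all lowercase. The extra Cleanliness words come from
# the example list in the question; the rest mirror the answer above.
categories = {
    'Cleanliness': {'insects', 'clean', 'cleaned', 'spoilt', 'dirty'},
    'AC Problem': {'ac'},
    'Food Problem': {'food'},
    'Train Delayed': {'delay'},
}
# Invert the structure: each word maps to its category.
lookup = {word: category for category, words in categories.items() for word in words}

def theme_sentences(sentences):
    """Return one category per sentence (hypothetical helper)."""
    output = []
    for sentence in sentences:
        # Lowercase the sentence and keep only runs of letters as words.
        words = re.findall(r'[a-z]+', sentence.lower())
        # The first word with a known category wins; 'None' if nothing matches.
        category = next((lookup[w] for w in words if w in lookup), 'None')
        output.append(category)
    return output
```

With this sketch, a call such as theme_sentences(['The AC is not working', 'Coach was dirty']) gives one category per sentence instead of a set. Only whole words match, so every variant ('clean', 'cleaned', ...) has to be listed explicitly.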
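Answer 1: (score: 1)
This should do what you are asking. I set all the keys to lowercase and convert i to lowercase when checking for a match, so a match is still found when the capitalization differs.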
def theme(x):
    output = []
    # I recommend coming up with a more suitable name for your dictionary in your actual program
    myDict = {"ac": "AC problem", "insects": "Cleanliness", "clean": "Cleanliness", "food": "Food Problem", "delay": "Train Delayed"}
    for i in x:
        if i.lower() in myDict:  # Checks whether i is in the dictionary before looking it up; prevents possible KeyErrors
            category = myDict[i.lower()]  # If it is in the dictionary, category is set to the value stored for that key
            output.append(category)
        else:
            output.append("None")  # If i isn't in the dictionary, output gets "None" instead
    return output
Here are some examples:
>>> print(theme(['Clean', 'Cleaned', 'spoilt', 'dirty']))
['Cleanliness', 'None', 'None', 'None']
>>> print(theme(['Delay', 'Ham', 'Cheese', 'Insects']))
['Train Delayed', 'None', 'None', 'Cleanliness']
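As the first example shows, an exact dictionary lookup misses variants such as 'Cleaned'. If substring matching like the `in` checks in the question is what you want, one option is to keep a list of keywords per category and test them with any(). This is a rough sketch rather than a definitive solution; the keyword lists and the name theme_substring are assumptions made for illustration.

```python
# Keyword lists per category (illustrative assumptions; extend as needed).
# Categories are checked in insertion order, so earlier entries take priority.
KEYWORDS = {
    'AC problem': ['ac'],
    'Cleanliness': ['insects', 'clean', 'spoilt', 'dirty'],
    'Food Problem': ['food'],
    'Train Delayed': ['delay'],
}

def theme_substring(sentences):
    """Return one category per sentence using case-insensitive substring checks."""
    output = []
    for sentence in sentences:
        text = sentence.lower()
        category = 'None'
        for name, keywords in KEYWORDS.items():
            # any() stops at the first keyword found inside the sentence.
            if any(keyword in text for keyword in keywords):
                category = name
                break
        output.append(category)
    return output
```

Be aware that short keywords match inside longer words: after lowercasing, 'ac' is also found in 'coach' or 'place', so a case-sensitive check or whole-word matching may suit that category better.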
Answer 2: (score: 0)
I came up with another approach:
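# Example word lists (assumed here; the original answer did not define them).
# They are lowercase because each sentence is lowercased before matching.
cleanliness = ['clean', 'cleaned', 'spoilt', 'dirty']
ac_problem = ['ac']

def theme(x):
    output = []
    for i in x:
        # Split the sentence into lowercase words and intersect with each list
        if set(cleanliness).intersection(i.lower().split()):
            category = 'clean'
        elif set(ac_problem).intersection(i.lower().split()):
            category = 'ac problem'
        else:
            category = 'none'
        output.append(category)
    return output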
Answer 3: (score: -1)
Maybe you could do it like this:
def theme(x):
    output = []
    name_dic = {
        "AC": "AC problem",
        "clean": "Cleanliness",
        "food": "Food Problem"
    }
    for e in x:
        output.append(name_dic.get(e))
    return output
Or, more precisely, like this:
def theme(x):
    output = []
    name_list = [
        ("AC", "AC problem"),
        ("clean", "Cleanliness"),
        ("insects", "Cleanliness"),
        ("food", "Food Problem")
    ]
    name_dic = dict(name_list)
    for e in x:
        output.append(name_dic.get(e))
    return output
Hope this helps.