Simple text classification in Python

Time: 2013-05-17 20:06:50

Tags: python

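I want to perform a simple text classification that does the following:

- check whether each 'loss description' contains a disaster-related keyword
- if it does, classify the record under that category; otherwise label it 'Non-Disaster'

Please point out the errors in my code (the first row of datarecords just holds the field names) or suggest a more efficient way to write it: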

disaster_cat = [(('lightning'),'lightning'),
                (('hurricane', 'sandy', 'irene', 'isaac', 'gustav'),'Hurricane'),
                (('tornado'),'Tornado'),
                (('flood'),'Flood'),
                (('wildfire', 'wild fire'),'Wild Fire')]

disaster_type = 'Non-Disaster' 
for record in datarecords[1:]:
    record.append(disaster_type) #pre-populate every record's disaster type with 'Non-Disaster'

for record in datarecords[1:]:    
    for pairs in disaster_cat:        
        for phrase in pairs[0]:            
            if phrase in record[loss_desc_idx]: #check whether the loss description contains the kw
                record[-1] = pairs[1]           #if it does, change disaster type from 'Non-Disaster'
                                                #to the matching disaster category

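The ideal end result: if the loss description is "my car was destroyed by superstorm sandy", the corresponding disaster type would be "Hurricane".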

1 answer:

Answer 0: (score: 1)

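To make a single-element tuple, you need a comma inside the parentheses. Without it, ('lightning') is just the string 'lightning', so the loop over pairs[0] iterates over its individual characters instead of over keywords. Write it like this: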

(('lightning',),'lightning')
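
For illustration, here is a minimal sketch of how the whole classification might look once the commas are in place. The classify helper, the sample datarecords rows, and the loss_desc_idx value are hypothetical and not from the question. The sketch also makes two assumptions that differ from the original loop: matching ignores case, and the first matching category wins rather than the last.

```python
# Parentheses alone do not make a tuple; the trailing comma does.
assert isinstance(('lightning'), str)
assert isinstance(('lightning',), tuple)

# Keyword tuples and labels from the question, with the missing commas added.
disaster_cat = [
    (('lightning',), 'lightning'),
    (('hurricane', 'sandy', 'irene', 'isaac', 'gustav'), 'Hurricane'),
    (('tornado',), 'Tornado'),
    (('flood',), 'Flood'),
    (('wildfire', 'wild fire'), 'Wild Fire'),
]


def classify(description, default='Non-Disaster'):
    """Return the first category with a keyword found in the description.

    Hypothetical helper: substring matching, as in the question's loop.
    """
    text = description.lower()  # assumption: matching should ignore case
    for keywords, category in disaster_cat:
        if any(keyword in text for keyword in keywords):
            return category
    return default


# Hypothetical sample data; the first row holds the field names.
datarecords = [
    ['id', 'loss_desc'],
    [1, 'my car was destroyed by superstorm sandy'],
    [2, 'tree fell on the roof'],
]
loss_desc_idx = 1  # assumed column index of the loss description

for record in datarecords[1:]:
    record.append(classify(record[loss_desc_idx]))
```

Returning from the helper on the first hit avoids the separate pre-population pass, since the default label is only used when no keyword matches. Because this is plain substring matching, a keyword such as 'irene' would also match inside a longer word or a person's name.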