How to train a classifier in Python using a list of lists of features

Time: 2017-08-15 03:57:29

Tags: python list machine-learning scikit-learn classification

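I have two variables called entity and label. The entity variable stores a list of words, and each element of that list is itself a list, so it is a list of lists. The inner lists are actually bigram features, so I need to keep them.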

I am trying to train a classifier with these two variables. My code so far:

from sklearn import svm
from sklearn.feature_extraction.text import TfidfVectorizer

entity = [[['Prabowo Subianto']], [['Muhtar Ependi']], [['Nina Zatulini']], [['Partai Gerindra']], [['Persiba']], [['Partai Kebangkitan Bangsa (PKB)'], ['Partai Kebangkitan'], ['Kebangkitan Bangsa'], ['Bangsa ('], ['( PKB'], ['PKB )']], [['Sman 3 Kabupaten Tangerang'], ['Sman 3'], ['3 Kabupaten'], ['Kabupaten Tangerang']], [['Bandara Changi Singapura'], ['Bandara Changi'], ['Changi Singapura']], [['Warung Kopi Kita'], ['Warung Kopi'], ['Kopi Kita']]]
label = ['PERSON', 'PERSON', 'PERSON', 'ORGANIZATION', 'ORGANIZATION', 'ORGANIZATION', 'LOCATION', 'LOCATION', 'LOCATION']

vectorizer = TfidfVectorizer(min_df=1)
train_vector_entity = vectorizer.fit_transform(entity)
train_vector_label = label

classifier = svm.SVC()
classifier_word = classifier.fit(train_vector_entity,train_vector_label)

The resulting error:

AttributeError: 'list' object has no attribute 'lower'

What is the best way to train the classifier? Thanks.

1 Answer:

Answer 0: (score: 0)

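TfidfVectorizer expects an iterable of strings, but each element of entity is a nested list, so the default preprocessing fails when it calls .lower() on a list. Just change this line so that each document is a plain string: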

train_vector_entity = vectorizer.fit_transform([i[0][0] for i in entity])
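Note that i[0][0] takes only the first string of each entry (the full entity name), so the extra n-grams stored alongside it are not used. If you want to keep them, one possible approach is to pass a callable as the analyzer argument of TfidfVectorizer, so that each nested list is treated as a document that is already split into features. The sketch below is only an illustration under that assumption: the helper name flatten_ngrams is made up here, and lowercasing the n-grams is my own choice, not something the question asks for.

```python
from sklearn import svm
from sklearn.feature_extraction.text import TfidfVectorizer

entity = [
    [['Prabowo Subianto']],
    [['Muhtar Ependi']],
    [['Nina Zatulini']],
    [['Partai Gerindra']],
    [['Persiba']],
    [['Partai Kebangkitan Bangsa (PKB)'], ['Partai Kebangkitan'],
     ['Kebangkitan Bangsa'], ['Bangsa ('], ['( PKB'], ['PKB )']],
    [['Sman 3 Kabupaten Tangerang'], ['Sman 3'], ['3 Kabupaten'],
     ['Kabupaten Tangerang']],
    [['Bandara Changi Singapura'], ['Bandara Changi'], ['Changi Singapura']],
    [['Warung Kopi Kita'], ['Warung Kopi'], ['Kopi Kita']],
]
label = ['PERSON', 'PERSON', 'PERSON',
         'ORGANIZATION', 'ORGANIZATION', 'ORGANIZATION',
         'LOCATION', 'LOCATION', 'LOCATION']


def flatten_ngrams(doc):
    # Hypothetical helper: doc looks like [['Warung Kopi'], ['Kopi Kita']].
    # Each inner one-element list becomes a single lowercased feature.
    return [gram[0].lower() for gram in doc]


# A callable analyzer receives each raw document and returns its features,
# so the default string preprocessing (and its .lower() call) is bypassed.
vectorizer = TfidfVectorizer(analyzer=flatten_ngrams)
train_vector_entity = vectorizer.fit_transform(entity)

classifier = svm.SVC()
classifier.fit(train_vector_entity, label)

# New entities must be given in the same nested-list shape before transform.
new_entity = [[['Bandara Changi']]]
prediction = classifier.predict(vectorizer.transform(new_entity))
```

With this setup every n-gram in the nested lists becomes its own column in the TF-IDF matrix. With only nine training samples the predictions will not be reliable, so treat this as a way to get the shapes right rather than a tuned model.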