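I am trying to load data from a CSV file into TextBlob to create a classifier, which I will then test using a training set. I started with a small CSV to make sure it works before writing hundreds of lines for the real thing. However, I have run into a problem. I am using the example code listed on the TextBlob website: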
from textblob.classifiers import NaiveBayesClassifier

with open('testerdict.csv',) as fp:
    cl = NaiveBayesClassifier(fp, format="csv")

test = [
    ('The beer was good.', 'pos'),
    ('I do not enjoy my job', 'neg'),
    ("I ain't feeling dandy today.", 'neg'),
    ("I feel amazing!", 'pos'),
    ('Gary is a friend of mine.', 'pos'),
    ("I can't believe I'm doing this.", 'neg')
]

cl = NaiveBayesClassifier(fp)
print(cl.classify("This is an amazing library!"))
print(cl.accuracy(test))
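The error I get depends on how I format my CSV file, so I think the problem lies there. If I format it like this:
I get this error: ValueError: I/O operation on closed file.
If I format it like this:
I get this error: ValueError: not enough values to unpack (expected 2, got 1)
As I said, I just put together a very quick and basic CSV so that I could work out how to format it correctly and load it into TextBlob as training data for the classifier. Does anyone have any idea what might be going wrong?
Thanks in advance.
Edit: full traceback, as requested: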
  File "C:\Users\Ver\AppData\Local\Programs\Python\Python36-32\SA Project\Work\tester.py", line 14, in <module>
    cl = NaiveBayesClassifier(fp)
  File "C:\Users\Ver\AppData\Local\Programs\Python\Python36-32\lib\site-packages\textblob\classifiers.py", line 205, in __init__
    super(NLTKClassifier, self).__init__(train_set, feature_extractor, format, **kwargs)
  File "C:\Users\Ver\AppData\Local\Programs\Python\Python36-32\lib\site-packages\textblob\classifiers.py", line 136, in __init__
    self.train_set = self._read_data(train_set, format)
  File "C:\Users\Ver\AppData\Local\Programs\Python\Python36-32\lib\site-packages\textblob\classifiers.py", line 148, in _read_data
    format_class = formats.detect(dataset)
  File "C:\Users\Ver\AppData\Local\Programs\Python\Python36-32\lib\site-packages\textblob\formats.py", line 145, in detect
    if Format.detect(fp.read(max_read)):
ValueError: I/O operation on closed file.
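Answer 0 (score: 0)
This works for me once the second call to NaiveBayesClassifier is removed. The traceback points at that second call, which runs after the with block has already closed fp. Is it necessary to call the constructor again? It looks like that is already done inside the with statement: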
from textblob.classifiers import NaiveBayesClassifier

# Train the classifier while the file is still open.
with open('<path to your file>.csv') as fp:
    cl = NaiveBayesClassifier(fp, format="csv")

test = [('The beer was good.', 'pos'),
        ('I do not enjoy my job', 'neg'),
        ("I ain't feeling dandy today.", 'neg'),
        ("I feel amazing!", 'pos'),
        ('Gary is a friend of mine.', 'pos'),
        ("I can't believe I'm doing this.", 'neg')]

# Removed: fp is closed once the with block exits, so this call fails.
#cl = NaiveBayesClassifier(fp)
print(cl.classify("This is an amazing library!"))
print(cl.accuracy(test))
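The "I/O operation on closed file" error from the traceback can be reproduced without TextBlob: a with block closes the file when it exits, so any later read on the same handle fails. Here is a minimal sketch using only the standard library. The temporary file and its contents are made up for illustration.

```python
import os
import tempfile

# Create a small throwaway CSV file (contents are illustrative only).
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False) as tmp:
    tmp.write("I love this sandwich.,pos\n")
    path = tmp.name

with open(path) as fp:
    first_read = fp.read()  # works: the file is still open here

# The with block has exited, so fp is now closed.
try:
    fp.read()
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(fp.closed)
print(error_message)

os.remove(path)
```

Anything that needs to read from fp therefore has to run inside the with block, or the file has to be opened again.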
I used the CSV format described here: http://textblob.readthedocs.io/en/dev/classifiers.html#classifying-text
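The other error in the question, "not enough values to unpack (expected 2, got 1)", is what Python raises when a row yields a single field where a (text, label) pair is expected, for example when a line has no comma between the sentence and its label. The sketch below shows that effect with the standard csv module. read_labeled_rows is a hypothetical helper written for this illustration, not TextBlob's own loading code, and the sample rows are assumed.

```python
import csv
import io


def read_labeled_rows(text):
    """Parse 'sentence,label' rows (hypothetical helper, not part of TextBlob)."""
    pairs = []
    for row in csv.reader(io.StringIO(text)):
        sentence, label = row  # raises ValueError unless the row has exactly 2 fields
        pairs.append((sentence, label))
    return pairs


# One sentence and one label per line, separated by a comma.
good_csv = "I love this sandwich.,pos\nI do not like this restaurant,neg\n"
# No comma between sentence and label, so the row has only one field.
bad_csv = "I love this sandwich. pos\n"

good_pairs = read_labeled_rows(good_csv)

try:
    read_labeled_rows(bad_csv)
    bad_error = None
except ValueError as exc:
    bad_error = str(exc)

print(good_pairs)
print(bad_error)
```

If a sentence itself contains a comma, wrapping it in double quotes keeps the csv module from splitting it into extra fields.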