I installed python-weka-wrapper on Windows 7 and tried to run the following example code:
import weka.core.jvm as jvm
jvm.start()
data_dir = "E:/Files/Fourth/"
from weka.core.converters import Loader
loader = Loader("weka.core.converters.TextDirectoryLoader")
datasets = [
data_dir + "File 1",
data_dir + "File 2",
data_dir + "File 3",
data_dir + "File 4",
data_dir + "File 5"
]
data = loader.load_file(datasets)
data.delete_last_attribute()
print(data)
I get the following error:
Traceback (most recent call last):
File "C:/Python27/weekaa.py", line 16, in <module>
data = loader.load_file(datasets)
File "C:\Python27\lib\site-packages\weka\core\converters.py", line 67, in load_file
self.enforce_type(self.jobject, "weka.core.converters.FileSourcedConverter")
File "C:\Python27\lib\site-packages\weka\core\classes.py", line 155, in enforce_type
raise TypeError("Object does not implement or subclass " + intf_or_class + "!")
TypeError: Object does not implement or subclass weka.core.converters.FileSourcedConverter!
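I tried the fix suggested in an earlier Stack Overflow question, adding weka.jar or python-weka-wrapper to the classpath, but it did not help. The error does not occur when loading .arff files.
Is there a way to load text files?
Note: each file in the dataset contains a set of text document files (to be used for clustering later).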
Answer 0 (score: 0)
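Weka's TextDirectoryLoader class cannot be used with python-weka-wrapper up to and including version 0.2.2. As the traceback shows, Loader.load_file requires the wrapped object to be a weka.core.converters.FileSourcedConverter, and TextDirectoryLoader fails that check. The upcoming 0.2.3 release (or the current code in the GitHub repository) contains a new Python wrapper called TextDirectoryLoader, available from the weka.core.converters module, which lets you use this class directly. This was also answered on the python-weka-wrapper mailing list.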
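import weka.core.jvm as jvm
from weka.core.converters import TextDirectoryLoader

jvm.start()  # the JVM must be running before any Weka class is used
text_dir = "/the/directory/you/want/to/load"
# -dir: directory to load, -F: store the file name in an extra attribute
loader = TextDirectoryLoader(options=["-dir", text_dir, "-F", "-charset", "UTF-8"])
data = loader.load()
print(unicode(data))  # Python 2; use str(data) on Python 3
jvm.stop()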
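The question lists five locations, while the loader above takes one directory at a time. Below is a minimal sketch of one way to handle that, assuming each location is a directory laid out the way Weka's TextDirectoryLoader expects (one subdirectory per class, with the text files inside). The helper names `build_options`, `summarize_layout` and `load_text_dirs` are hypothetical and not part of python-weka-wrapper; only the `TextDirectoryLoader(options=...)` call and `load()` come from the answer.

```python
import os


def build_options(text_dir, charset="UTF-8"):
    """Return the option list from the answer for a single directory."""
    # -dir: directory to load
    # -F: store the file name in an additional attribute
    # -charset: encoding used to read the documents
    return ["-dir", text_dir, "-F", "-charset", charset]


def summarize_layout(text_dir):
    """Map each subdirectory (treated as a class label) to its file count.

    Hypothetical sanity check: run it before loading to confirm the
    directory has the class-per-subdirectory layout assumed here.
    """
    summary = {}
    for name in sorted(os.listdir(text_dir)):
        path = os.path.join(text_dir, name)
        if os.path.isdir(path):
            files = [f for f in os.listdir(path)
                     if os.path.isfile(os.path.join(path, f))]
            summary[name] = len(files)
    return summary


def load_text_dirs(dirs):
    """Load every directory with its own loader (requires a started JVM)."""
    # Imported lazily so the helpers above work without Weka installed.
    from weka.core.converters import TextDirectoryLoader
    return [TextDirectoryLoader(options=build_options(d)).load() for d in dirs]
```

With the JVM started, `load_text_dirs(datasets)` would return one dataset per directory. If `summarize_layout` returns an empty dictionary for a directory, the text files are probably sitting at the top level and would need to be moved into a subdirectory first.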