The dump command in Python's arff library lets the user create an arff file from a given input. For example, the command:
arff.dump("outputDir", data, relation="relation1",
          names=['age', 'fatRatio', 'hairColor'])
produces the following arff:
@relation relation1
@attribute age real
@attribute hairColor string
@data
10,0.2,black
22,10,yellow
30,2,black
given the data:
data = [[10,0.2,'black'],[22,10,'yellow'],[30,2,'black']]
My question is: how do I tell the library that I want hairColor to be a nominal attribute? That is, I want my arff header to look like this:
@relation relation1
@attribute age real
@attribute hairColor **nominal**
@data
...
Answer 0 (score: 0)
Several different approaches are outlined here:
https://code.google.com/p/arff/wiki/Documentation
For my purposes, the better approach is the second one recommended there:
arff_writer = arff.Writer(fname, relation='diabetics_data', names=names)
arff_writer.pytypes[arff.nominal] = '{not_parasite,parasite}'
arff_writer.write([arff.nominal('parasite')])
If you look at the code for arff.nominal, it is defined as follows:
class Nominal(str):
    """Use this class to wrap strings which are intended to be nominals
    and shouldn't have enclosing quote signs."""
    def __repr__(self):
        return self
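To see what the wrapper changes, here is a minimal standalone sketch that does not import the arff library. It copies the shape of the class quoted above (returning str(self) instead of self, which is equivalent for this purpose) and compares the repr of a plain string with the repr of a wrapped one.

```python
class Nominal(str):
    """Local copy of the wrapper idea: a str whose repr has no quote signs."""
    def __repr__(self):
        return str(self)


plain = "black"
wrapped = Nominal("black")

plain_repr = repr(plain)      # a plain str repr includes the enclosing quotes
wrapped_repr = repr(wrapped)  # the wrapper's repr is the bare text

print(plain_repr)
print(wrapped_repr)
print(isinstance(wrapped, str))  # the wrapper still behaves like a string
```

The wrapped value is still a str, so it can be used anywhere a string is expected; only its repr and its type differ.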
So what I did was create a separate "wrapper" nominal class for each nominal attribute, like this:
class ZipCode(str):
    """Use this class to wrap strings which are intended to be nominals
    and shouldn't have enclosing quote signs."""
    def __repr__(self):
        return self
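The reason for one class per attribute can be sketched without the library. This assumes, as the pytypes assignments in this answer suggest, that the attribute declaration is looked up by the Python type of each value. The HairColor class and the declarations dict below are hypothetical stand-ins for illustration, not part of the arff package.

```python
class HairColor(str):
    """Hypothetical wrapper for the hairColor attribute."""
    def __repr__(self):
        return str(self)


class ZipCode(str):
    """Hypothetical wrapper for a zip code attribute."""
    def __repr__(self):
        return str(self)


# Stand-in for a type-keyed mapping such as Writer.pytypes (an assumption).
declarations = {
    float: 'real',
    str: 'string',
    HairColor: '{black,yellow}',
    ZipCode: '{85104,84095}',
}

row = [HairColor('black'), ZipCode('85104'), 'free text']

# Each value selects its declaration through its exact type.
looked_up = [declarations[type(value)] for value in row]
print(looked_up)
```

With a single shared wrapper class, every nominal column would map to the same value set; distinct classes give each column its own entry.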
Then, following the Writer code above, you can do the following:
arff_writer = arff.Writer(fname, relation='neighborhood_data', names=names)
arff_writer.pytypes[type(myZipCodeObject)] = '{85104,84095}'
# then write out the rest of your attributes...
arff_writer.write([arff.nominal('parasite')])
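If the wrapper classes feel heavy, another option is to build the file text by hand. The sketch below is not the arff library's API: to_arff is a hypothetical helper written only for illustration, using the data from the question. It declares the columns named in nominal with a {v1,v2,...} value set collected from the data, and it does no quoting or escaping, so values containing spaces or commas would need extra handling.

```python
def to_arff(relation, names, rows, nominal=()):
    """Return ARFF text; columns named in `nominal` get a {v1,v2,...} declaration."""
    lines = ["@relation %s" % relation]
    for i, name in enumerate(names):
        column = [row[i] for row in rows]
        if name in nominal:
            # Distinct values in first-seen order.
            values = list(dict.fromkeys(str(v) for v in column))
            decl = "{" + ",".join(values) + "}"
        elif all(isinstance(v, (int, float)) for v in column):
            decl = "real"
        else:
            decl = "string"
        lines.append("@attribute %s %s" % (name, decl))
    lines.append("@data")
    for row in rows:
        lines.append(",".join(str(v) for v in row))
    return "\n".join(lines) + "\n"


data = [[10, 0.2, 'black'], [22, 10, 'yellow'], [30, 2, 'black']]
text = to_arff("relation1", ["age", "fatRatio", "hairColor"], data,
               nominal={"hairColor"})
print(text)
```

The text can then be written to disk with an ordinary open(path, "w") call.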