I'm trying to write a J48 parseTree algorithm in Python, but I've run into a strange problem:
def parseTree(f1):
    line = f1.readline()
    while not line.startswith("attribute"):
        f2.write(line)
        save = f1.tell()
        line = f1.readline()
    print f1.tell()
    print f1.readline()
    f1.seek(1518)
    print f1.readline()
The output is:
1518
attribute22 > 0
te14 = Y
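I'm confused about why the two f1.readline() calls return different text, even though the seek goes to the same offset that f1.tell() had just reported.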
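For comparison, here is a minimal sketch of the same scan with the file opened in binary mode, where tell() returns plain byte offsets that seek() accepts unchanged. It rests on an assumption the output above does not confirm: that the mismatch comes from text-mode offset handling (for example, Python 2 on Windows reading a file with bare "\n" line endings). The helper names find_tree_start and read_line_at are hypothetical and only there for illustration.

```python
import io


def find_tree_start(path, prefix=b"attribute"):
    """Return (offset, line) for the first line that starts with `prefix`.

    Assumption: opening in binary mode avoids the tell()/seek() mismatch.
    This is a guess at the cause, not a confirmed fix.
    """
    with io.open(path, "rb") as f:
        while True:
            offset = f.tell()   # byte offset of the line about to be read
            line = f.readline()
            if not line:        # end of file, nothing matched
                return None, None
            if line.startswith(prefix):
                return offset, line


def read_line_at(path, offset):
    """Seek to a byte offset and return the line that begins there."""
    with io.open(path, "rb") as f:
        f.seek(offset)
        return f.readline()
```

If the assumption holds, read_line_at(path, offset) should give back the same line that find_tree_start reported, because the offset is recorded before that line is read.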
Here is part of the J48 tree output:
=== Run information ===
Scheme:weka.classifiers.trees.J48 -C 0.25 -M 2
Relation: cls-weka.filters.unsupervised.attribute.Remove-R1,25-27,48-56
Instances: 60818
Attributes: 43
cert_category
attribute1
attribute2
attribute3
attribute4
attribute5
attribute6
attribute7
attribute8
attribute9
attribute10
attribute11
attribute12
attribute13
attribute14
attribute15
attribute16
attribute17
attribute18
attribute19
attribute20
attribute21
attribute22
attribute26
attribute27
attribute28
attribute29
attribute23_days
attribute24_days
attribute25_days
attribute30
attribute31
attribute32
attribute33
attribute34
attribute35
attribute36
attribute37
attribute38
attribute39
attribute40
attribute41
attribute42_num
Test mode:10-fold cross-validation
=== Classifier model (full training set) ===
J48 pruned tree
------------------
attribute22 <= 0: 4 (406.0)
attribute22 > 0
| attribute23_days <= 1
| | attribute14 = Y
| | | attribute37 = Y: 0 (60.0/2.0)
| | | attribute37 = N: 5 (17.0/1.0)
| | | attribute37 = A: 0 (0.0)
| | attribute14 = N
| | | attribute23_days <= 0: 5 (45.0)
| | | attribute23_days > 0
| | | | attribute2 <= 26: 5 (20.0)
| | | | attribute2 > 26
| | | | | attribute3 = Y: 5 (13.0)
| | | | | at