After running the C5.0 algorithm on my data with
a <- C5.0(FACTOR ~ ., data = i_data, trials = 10, costs = matrix(c(0, 1, 4, 0), nrow = 2))
I print the model summary with
summary(a)
and get something like this:
.
.
.
.
SubTree [S1]

Col_L > 89: N (195.6/6.5)
Col_L <= 89:
:...Col_Q > 4657: Y (66.6/34)
    Col_Q <= 4657:
    :...Col_F > 15: Y (117.6/75)
        Col_F <= 15:
        :...Col_C <= 5.6926: N (2040.5/266.7)
            Col_C > 5.6926: Y (148.7/104.4)

SubTree [S2]

Col_E > 14: N (2523.3/176.8)
Col_E <= 14:
:...Col_G > 5: N (83.4/1.4)
    Col_G <= 5:
    :...Col_O > 6880: Y (41.8/22)
        Col_O <= 6880:
        :...Col_G <= 3: N (1939.9/230.1)
            Col_G > 3: Y (92.7/64.5)
Evaluation on training data (53392 cases):

Trial       Decision Tree
-----  -----------------------
       Size      Errors   Cost

   0     87 16173(30.3%)   0.35
   1     25 14071(26.4%)   0.43
   2     48 15295(28.6%)   0.74
   3     50 14672(27.5%)   0.48
   4     43 16765(31.4%)   0.55
   5     52 16346(30.6%)   0.98
   6     58 18277(34.2%)   0.52
   7     65 13940(26.1%)   0.64
   8     63 14020(26.3%)   0.42
   9     57 13517(25.3%)   0.45
boost       13284(24.9%)   0.39   <<

   (a)   (b)    <-classified as
  ----  ----
 15848 10848    (a): class N
  2436 24260    (b): class Y

Attribute usage:
100.00% Col_A
100.00% Col_B
100.00% Col_C
100.00% Col_D
100.00% Col_E
99.79% Col_F
99.63% Col_G
76.66% Col_H
76.55% Col_I
75.64% Col_J
70.22% Col_K
65.15% Col_L
59.01% Col_M
58.94% Col_N
42.54% Col_O
33.01% Col_P
21.73% Col_Q
16.58% Col_R
12.69% Col_S
8.43% Col_T
Is there a way to extract

   (a)   (b)    <-classified as
  ----  ----
 15848 10848    (a): class N
  2436 24260    (b): class Y

from the summary above, so that I can load it in another R instance?
Answer 0 (score: 1)
C5.0 stores the summary as plain text, but you can pull that part out of it like this:
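#example from ?C5.0
library(C50)
#note: recent C50 releases may no longer ship the churn data; any fitted C5.0 model works the same way
data(churn)
treeModel <- C5.0(x = churnTrain[, -20], y = churnTrain$churn)
treeModel
#save the summary in b
#b$output is the printed text, held as a single string
b <- summary(treeModel)
#get the position of the first '(a)', i.e. the confusion matrix header
pos1 <- gregexpr(pattern = '\\(a\\)', b$output)[[1]][1]
#get the start position of 'class no' (the last row's label); in your case this should be 'class Y'
last_label <- 'class no'
pos2 <- gregexpr(pattern = last_label, b$output)[[1]][1]
#substring from the header through the END of the label
#(stopping at pos2 itself would cut the last row off after its first character)
text <- substr(b$output, pos1, pos2 + nchar(last_label) - 1)
#print
cat(text)
Output:
(a)   (b)    <-classified as
  ----  ----
   365   118    (a): class yes
    18  2832    (b): class no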
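
The substring above is still just text. To load the matrix in another R instance, it probably helps to turn it into a numeric matrix first and then save it with saveRDS(). Below is a minimal sketch of that idea. The helper name parse_confusion and the file name are made up for illustration, and the sample text is copied from the question so the snippet runs without C50 installed. It assumes every cell of the matrix is printed; if C5.0 leaves a zero cell blank, the whitespace split would need to be replaced by fixed-width parsing.

```r
# Confusion-matrix block copied from the summary output in the question.
cm_text <- "   (a)   (b)    <-classified as
  ----  ----
 15848 10848    (a): class N
  2436 24260    (b): class Y"

# Hypothetical helper (not part of C50): parse the printed block into a matrix.
# Assumption: every cell is printed, so splitting on whitespace is safe.
parse_confusion <- function(txt) {
  lines <- strsplit(txt, "\n", fixed = TRUE)[[1]]
  # keep only the data rows, which end in "(x): class <label>"
  rows <- grep("\\([a-z]\\): class ", lines, value = TRUE)
  labels <- trimws(sub(".*: class ", "", rows))
  # drop the trailing "(x): class <label>" part and split the counts
  counts <- strsplit(trimws(sub("\\([a-z]\\): class .*", "", rows)), "\\s+")
  m <- do.call(rbind, lapply(counts, as.numeric))
  # in the printed summary, rows are the actual class, columns the predicted one
  dimnames(m) <- list(actual = labels, predicted = labels)
  m
}

cm <- parse_confusion(cm_text)
print(cm)

# Sanity check: off-diagonal cells should add up to the "boost" error count.
errors <- sum(cm) - sum(diag(cm))

# Save to disk; another R session can read it back with readRDS().
path <- file.path(tempdir(), "c50_confusion.rds")
saveRDS(cm, path)
cm_loaded <- readRDS(path)
```

Because the helper only looks at lines containing "(x): class", it should also work on the full text, for example parse_confusion(summary(a)$output), without cutting out the substring first. If the fitted model and the training data are still available, building the table directly from predict(a, i_data) and the true labels is another option that avoids text parsing altogether.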