Accessing individual results from the decision tree function JRip (RWeka library)

Time: 2014-05-05 18:32:30

Tags: r

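I am using library(RWeka) and running the JRip function on a dataset. Does anyone know of a way to access the resulting set of rules programmatically, so that I can access each rule individually?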

The following example is for illustration only:

> library(RWeka)
> library(datasets)
> head(npk)
  block N P K yield
1     1 0 1 1  49.5
2     1 1 1 0  62.8
3     1 0 0 0  46.8
4     1 1 0 1  57.0
5     2 1 0 0  59.8
6     2 1 1 1  58.5
> tree_rip <- JRip(block ~ ., data = npk)
> tree_rip
JRIP rules:
===========

(yield <= 48.8) => block=4 (5.0/2.0)
(yield <= 52) => block=5 (4.0/1.0)
=> block=3 (15.0/11.0)

Number of Rules : 3

I want to access the results as a data frame or table. The closest I have come is retrieving them as a single blob string, as follows:

> tree_rip$classifier
[1] "Java-Object{JRIP rules:\n===========\n\n(yield <= 48.8) => block=4 (5.0/2.0)\n(yield <= 52) => block=5 (4.0/1.0)\n => block=3 (15.0/11.0)\n\nNumber of Rules : 3\n}"

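I need something that gives me each result separately, just as they are printed when I call tree_rip, so that I can get not only the number of rules found but also access them one by one.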

At the very least something like this (though ideally with each component of each rule accessible separately, row by row):

[1] (yield <= 48.8) => block=4 (5.0/2.0)
[2] (yield <= 52) => block=5 (4.0/1.0)
...

Thanks!

1 Answer:

Answer 0: (score: 1)

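This was fairly difficult for me, since I am not a regular user of R's Java integration. Anyway, I looked at the following results in an effort to understand how the REPL produces the output you are seeing: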

str(tree_rip)
# omitting about 15 lines of output
# - attr(*, "class")= chr [1:3] "JRip" "Weka_rules" "Weka_classifier"

getAnywhere(print.JRIP)
# no object named ‘print.JRIP’ was found
getAnywhere(print.Weka_rules)
# no object named ‘print.Weka_rules’ was found
help(pack="RWeka")
getAnywhere(print.Weka_classifier)
# this did succeed ... so I thought `.jcall` should also succeed

.jcall(tree_rip$classifier, "S", "toString")
 #    Error: could not find function ".jcall"
 RWeka:::.jcall(tree_rip$classifier, "S", "toString")
 #    Error in get(name, envir = asNamespace(pkg), inherits = FALSE) : 
 #      object '.jcall' not found

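... I discovered that pkg:rJava needs to be attached in order to access the .jcall function. Apparently this is one of those cases where a supporting package is loaded but not attached. (It is similar to the [mistaken] assumption that grid.text should be available when only pkg:lattice has been attached.) With that in place, this gets the desired set of strings: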

library(rJava)
as.matrix(scan(text=.jcall(tree_rip$classifier, "S", "toString") ,sep="\n", what="") )[
                                                                       -c(1:2, 6), ,drop=FALSE]
#------------
     [,1]                                  
[1,] "(yield <= 48.8) => block=4 (5.0/2.0)"
[2,] "(yield <= 52) => block=5 (4.0/1.0)"  
[3,] " => block=3 (15.0/11.0)"
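
The matrix above still holds each rule as one string, and the index `-c(1:2, 6)` only fits a model with exactly three rules. As a rough sketch of how to get closer to the data frame the question asks for, the lines containing `=>` can be selected by pattern and then split with regular expressions. The helper name `parse_jrip_rules` and the column names are made up for illustration and are not part of RWeka. The sketch also assumes that every rule ends in `(covered/errors)`, as in the output shown, and that the two numbers are the instances covered and the instances misclassified. To keep the example runnable without Java, the text is pasted in from the question; with RWeka and rJava attached it could presumably come from `.jcall(tree_rip$classifier, "S", "toString")` instead.

```r
# Rule text copied from the question's output (assumption: in a live session
# this would come from .jcall(tree_rip$classifier, "S", "toString")).
txt <- paste0(
  "JRIP rules:\n===========\n\n",
  "(yield <= 48.8) => block=4 (5.0/2.0)\n",
  "(yield <= 52) => block=5 (4.0/1.0)\n",
  " => block=3 (15.0/11.0)\n\n",
  "Number of Rules : 3\n"
)

# Hypothetical helper, not an RWeka function: returns one row per rule.
parse_jrip_rules <- function(txt) {
  lines <- strsplit(txt, "\n", fixed = TRUE)[[1]]
  # Keep only the rule lines, however many there are.
  rules <- grep("=>", lines, fixed = TRUE, value = TRUE)

  # Left of "=>" is the condition (empty for the default rule).
  condition <- trimws(sub("=>.*$", "", rules))
  # Right of "=>" looks like "block=4 (5.0/2.0)".
  rhs <- trimws(sub("^.*=>", "", rules))

  pattern <- "^([^=]+)=(.*) \\(([0-9.]+)/([0-9.]+)\\)$"
  m <- regmatches(rhs, regexec(pattern, rhs))
  # Each match holds the full string plus four capture groups.
  stopifnot(all(lengths(m) == 5))
  parts <- do.call(rbind, m)

  data.frame(
    condition = condition,
    target    = parts[, 2],
    class     = parts[, 3],
    covered   = as.numeric(parts[, 4]),
    errors    = as.numeric(parts[, 5]),
    stringsAsFactors = FALSE
  )
}

rules_df <- parse_jrip_rules(txt)
print(rules_df)
```

Individual rules are then available as rows, for example `rules_df[1, ]`, and `nrow(rules_df)` gives the number of rules. Because this parses printed output rather than the underlying Java objects, it would need adjusting if the printed format differs between Weka versions.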