Question

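I am using ctree() from the party package in R. I want to be able to use columns from several data frames, referring to each column individually with $. This has worked for me in the past, but this time it does not.

To illustrate the error, I have combined a sample data set into a single data frame. When I run: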

>ctree(data$adult_age~data$child_age+data$freq)

I get the following error:

>Error in model.frame.default(formula = ~data$adult_age, data = list(),  : 
  invalid type (NULL) for variable 'data$adult_age'

If I run it like this, it works:

>ctree(adult_age~child_age+freq, data)

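Usually these two ways of writing the call are interchangeable (with lm(), for example, I get the same results from both), but with ctree() I get an error. Why? And how can I fix this so that I can pull columns from different data frames in a single call?

My data is structured as follows: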

> dput(data)

>structure(list(adult_age = c(38, 38, 38, 38, 38, 55.5, 55.5, 38, 38, 38), child_age = c(8, 8, 13, 3.5, 3.5, 13, 8, 8, 8, 13), freq = c(0.1, 12, 0.1, 0.1, 0.1, 0.1, 1, 2, 0.1, 0.1)), .Names = c("adult_age", "child_age", "freq"), class = "data.frame", row.names = c(12L, 13L, 14L, 15L, 18L, 20L, 22L, 23L, 24L, 25L))

If you want to run the sample data:

>adult_age = c(38, 38, 38, 38, 38, 55.5, 55.5, 38, 38, 38)

>child_age = c(8, 8, 13, 3.5, 3.5, 13, 8, 8, 8, 13)

>freq = c(0.1, 12, 0.1, 0.1, 0.1, 0.1, 1, 2, 0.1, 0.1)

>data=as.data.frame(cbind(adult_age, child_age, freq))

Answer 1

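Why not to use this approach

Never use data$ inside a model formula (as @Roland already pointed out). Apart from unnecessarily repeating the data name and having to type more, it is a source of confusion and errors. If you have not run into problems with this in lm() yet, then you have not used predict() yet. Consider a simple linear regression for your data: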

m1 <- lm(adult_age ~ child_age, data = data)
m2 <- lm(data$adult_age ~ data$child_age)
coef(m1) - coef(m2)
## (Intercept)   child_age 
##           0           0

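So both approaches lead to the same coefficient estimates. But you get into trouble whenever you want to use the same formula with different, updated, or subsetted data. The most prominent case is predict(), for example when predicting at child_age = 0. The intended usage, with formula and data properly separated, recovers the intercept: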

predict(m1, newdata = data.frame(child_age = 0))
##        1 
## 36.38919 
coef(m1)[1]
## (Intercept) 
##    36.38919

But for the data$ version, newdata is not used at all in the actual prediction:

predict(m2, newdata = data.frame(child_age = 0))
##        1        2        3        4        5        6        7        8 
## 41.14343 41.14343 44.11483 38.46917 38.46917 44.11483 41.14343 41.14343 
##        9       10 
## 41.14343 44.11483 
## Warning message:
## 'newdata' had 1 row but variables found have 10 rows

There are many more examples like this, but this one should be serious enough to make you avoid the approach.
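One way to see where the problem comes from is to look at the variable names each model stores. The sketch below rebuilds the sample data from the question and refits both models on a subset; the name `sub` and the cutoff `child_age >= 8` are assumptions chosen purely for illustration.

```r
# Sample data from the question
data <- data.frame(
  adult_age = c(38, 38, 38, 38, 38, 55.5, 55.5, 38, 38, 38),
  child_age = c(8, 8, 13, 3.5, 3.5, 13, 8, 8, 8, 13),
  freq = c(0.1, 12, 0.1, 0.1, 0.1, 0.1, 1, 2, 0.1, 0.1)
)

m1 <- lm(adult_age ~ child_age, data = data)
m2 <- lm(data$adult_age ~ data$child_age)

# m1 stores plain column names; m2 stores the literal strings
# "data$adult_age" and "data$child_age", which no new data frame
# will ever have as column names.
vars1 <- names(model.frame(m1))
vars2 <- names(model.frame(m2))
print(vars1)
print(vars2)

# Refit on a subset (illustrative cutoff). m1 uses the rows of `sub`;
# m2 still looks up the original object `data` and ignores `sub`.
sub <- data[data$child_age >= 8, ]
n1 <- nobs(update(m1, data = sub))
n2 <- nobs(update(m2, data = sub))
print(c(m1 = n1, m2 = n2))
```

Because the second model hard-codes a reference to the object `data`, anything passed through a `data` or `newdata` argument is silently bypassed, which is the same mechanism behind the predict() warning above.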

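How to apply it with `ctree()` anyway

If you decide to shoot yourself in the foot with the data$ approach, you can do so with the new (and recommended) implementation of ctree() in the partykit package. Its entire formula/data handling was rewritten using standard nonstandard evaluation.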

library("partykit")
ctree(adult_age ~ child_age + freq, data = data)
## Model formula:
## adult_age ~ child_age + freq
## 
## Fitted party:
## [1] root: 41.500 (n = 10, err = 490.0) 
## 
## Number of inner nodes:    0
## Number of terminal nodes: 1
ctree(data$adult_age ~ data$child_age + data$freq)
## Model formula:
## data$adult_age ~ data$child_age + data$freq
## 
## Fitted party:
## [1] root: 41.500 (n = 10, err = 490.0) 
## 
## Number of inner nodes:    0
## Number of terminal nodes: 1
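For the original goal of using columns that live in different data frames, a safer route is probably to assemble one data frame first and pass it through the data argument. The sketch below is a minimal illustration: the data frames `adults` and `children` and the name `combined` are hypothetical, filled with the sample values from the question, and it assumes the rows of both sources are in the same order.

```r
# Hypothetical separate sources (names are illustrative only),
# holding the sample values from the question.
adults <- data.frame(
  adult_age = c(38, 38, 38, 38, 38, 55.5, 55.5, 38, 38, 38)
)
children <- data.frame(
  child_age = c(8, 8, 13, 3.5, 3.5, 13, 8, 8, 8, 13),
  freq = c(0.1, 12, 0.1, 0.1, 0.1, 0.1, 1, 2, 0.1, 0.1)
)

# Assumption: row i of `adults` and row i of `children` describe the same case.
stopifnot(nrow(adults) == nrow(children))

# Collect the needed columns into one data frame with plain names.
combined <- data.frame(
  adult_age = adults$adult_age,
  child_age = children$child_age,
  freq = children$freq
)

# Fit the tree with formula and data kept separate.
# Guarded so the sketch also runs where partykit is not installed.
if (requireNamespace("partykit", quietly = TRUE)) {
  fit <- partykit::ctree(adult_age ~ child_age + freq, data = combined)
  print(fit)
}
```

If the sources are not row-aligned, they would need to be joined on a shared key (for example with merge()) before fitting; no such key exists in the sample data, so that step is not shown.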
