Specifying a regression in R using indicator variables

Time: 2013-01-28 19:23:15

Tags: r formula regression

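I want to specify a regression in R that estimates coefficients on x conditional on whether a third variable, z, is greater than 0. For example: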

y ~ a + x*1(z>0) + x*1(z<=0)

What is the correct way to do this in R using a formula?

2 answers:

Answer 0: (score: 10)

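The ":" (colon) operator constructs conditional interactions when it is used with disjoint predictors built with I(). Here the logical I(z > 0) splits the rows into two disjoint groups, so x gets a separate slope in each. Use predict to get predictions from the fitted model: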

> y=rnorm(10)
> x=rnorm(10)
> z=rnorm(10)
> mod <- lm(y ~ x:I(z>0) )
> mod

Call:
lm(formula = y ~ x:I(z > 0))

Coefficients:
    (Intercept)  x:I(z > 0)FALSE   x:I(z > 0)TRUE  
      -0.009983        -0.203004        -0.655941  

> predict(mod, newdata=data.frame(x=1:10, z=c(-1, 1)) )
         1          2          3          4          5          6          7 
-0.2129879 -1.3218653 -0.6189968 -2.6337471 -1.0250057 -3.9456289 -1.4310147 
         8          9         10 
-5.2575108 -1.8370236 -6.5693926 
> plot(1:10, predict(mod, newdata=data.frame(x=1:10, z=c(-1)) )  )
> lines(1:10, predict(mod, newdata=data.frame(x=1:10, z=c(1)) ) )
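
To make the role of z in predict more concrete, here is a minimal sketch. The seed, the sample size of 50, and the names slope_neg and slope_pos are assumptions for illustration and are not part of the original answer. It takes predictions one unit of x apart, so the difference is the slope that applies for that sign of z:

```r
set.seed(42)                      # assumed seed, only for reproducibility
n <- 50                           # assumed sample size
y <- rnorm(n); x <- rnorm(n); z <- rnorm(n)
mod <- lm(y ~ x:I(z > 0))

# Predictions at x = 0 and x = 1; their difference is the slope in effect
slope_neg <- diff(predict(mod, newdata = data.frame(x = 0:1, z = -1)))
slope_pos <- diff(predict(mod, newdata = data.frame(x = 0:1, z = 1)))

c(slope_neg, slope_pos)           # should match the two x:I(z > 0) coefficients
coef(mod)[-1]
```

With z = -1 the prediction uses the x:I(z > 0)FALSE coefficient, and with z = 1 it uses the x:I(z > 0)TRUE coefficient. The intercept is shared by both groups.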

It may help to look at its model matrix:

> model.matrix(mod)
   (Intercept) x:I(z > 0)FALSE x:I(z > 0)TRUE
1            1      -0.2866252     0.00000000
2            1       0.0000000    -0.03197743
3            1      -0.7427334     0.00000000
4            1       2.0852202     0.00000000
5            1       0.8548904     0.00000000
6            1       0.0000000     1.00044600
7            1       0.0000000    -1.18411791
8            1       0.0000000    -1.54110256
9            1       0.0000000    -0.21173300
10           1       0.0000000     0.17035257
attr(,"assign")
[1] 0 1 1
attr(,"contrasts")
attr(,"contrasts")$`I(z > 0)`
[1] "contr.treatment"
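
The pattern in the model matrix above, where each row has x in exactly one of the two interaction columns and 0 in the other, can be checked directly. This is a small sketch with an assumed seed and sample size. The variable mm is a name introduced here.

```r
set.seed(1)                       # assumed seed, only for reproducibility
x <- rnorm(20); z <- rnorm(20); y <- rnorm(20)
mod <- lm(y ~ x:I(z > 0))
mm <- model.matrix(mod)

# Column 2 holds x where z <= 0 (zero elsewhere); column 3 holds x where z > 0
all.equal(unname(mm[, 2]), x * (z <= 0))
all.equal(unname(mm[, 3]), x * (z > 0))
```

So the colon formula is equivalent to regressing y on the two hand-built columns x * (z <= 0) and x * (z > 0), which is what the question's 1(z>0) notation asks for.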

Answer 1: (score: 2)

  y <- c(4.17,5.58,5.18,6.11,4.50,4.61,5.17,4.53,5.33,5.14)
  z <- sample(x=-10:10,size=length(y),replace=TRUE)
  x <- c(4.81,4.17,4.41,3.59,5.87,3.83,6.03,4.89,4.32,4.69)
  a <- rnorm(n=length(x))
  lm(y~a+I(x*1*I(z>0))+ I(x*1*I(z<=0)))

But I think using the : operator, as in DWin's solution, is more elegant.

Edit

lm(y~a + I(x * 1 * I(z> 0))+ I(x * 1 * I(z <= 0)))

Call:
lm(formula = y ~ a + I(x * 1 * I(z > 0)) + I(x * 1 * I(z <= 0)))

Coefficients:
         (Intercept)                     a   I(x * 1 * I(z > 0))  I(x * 1 * I(z <= 0))  
              6.5775               -0.1345               -0.3352               -0.3366  

> lm(formula = y ~ a+ x:I(z > 0))

Call:
lm(formula = y ~ a + x:I(z > 0))

Coefficients:
    (Intercept)                a  x:I(z > 0)FALSE   x:I(z > 0)TRUE  
         6.5775          -0.1345          -0.3366          -0.3352
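
The two outputs above report the same estimates in a different order. As a hedged sketch, the comparison can be made reproducible by fixing a seed. The seed value and the names fit_I and fit_colon are assumptions introduced here. The y and x values are those from this answer.

```r
set.seed(123)                     # assumed seed; the answer did not fix one
y <- c(4.17, 5.58, 5.18, 6.11, 4.50, 4.61, 5.17, 4.53, 5.33, 5.14)
x <- c(4.81, 4.17, 4.41, 3.59, 5.87, 3.83, 6.03, 4.89, 4.32, 4.69)
z <- sample(-10:10, size = length(y), replace = TRUE)
a <- rnorm(length(x))

# Explicit indicator products versus the colon interaction
fit_I     <- lm(y ~ a + I(x * (z > 0)) + I(x * (z <= 0)))
fit_colon <- lm(y ~ a + x:I(z > 0))

# Same fitted values; the slopes are the same but listed in a different order
all.equal(fitted(fit_I), fitted(fit_colon))
all.equal(unname(coef(fit_I))[c(1, 2, 4, 3)], unname(coef(fit_colon)))
```

The I() version lists the z > 0 slope first, while the colon version lists the FALSE level (z <= 0) first, hence the reordering in the last line.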