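I want to specify a regression in R that estimates a separate coefficient on x depending on whether a third variable z is greater than 0. For example:
y ~ a + x*1(z>0) + x*1(z<=0)
What is the correct way to do this in R using a formula?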
Answer 0 (score: 10)
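The ":" (colon) operator is used to construct conditional interactions (when used with disjoint predictors constructed using I()). It should be used together with predict: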
> y=rnorm(10)
> x=rnorm(10)
> z=rnorm(10)
> mod <- lm(y ~ x:I(z>0) )
> mod
Call:
lm(formula = y ~ x:I(z > 0))
Coefficients:
    (Intercept)  x:I(z > 0)FALSE   x:I(z > 0)TRUE
      -0.009983        -0.203004        -0.655941
> predict(mod, newdata=data.frame(x=1:10, z=c(-1, 1)) )
         1          2          3          4          5          6          7
-0.2129879 -1.3218653 -0.6189968 -2.6337471 -1.0250057 -3.9456289 -1.4310147
         8          9         10
-5.2575108 -1.8370236 -6.5693926
> plot(1:10, predict(mod, newdata=data.frame(x=1:10, z=c(-1)) ) )
> lines(1:10, predict(mod, newdata=data.frame(x=1:10, z=c(1)) ) )
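To make explicit what predict is doing here, the following is a minimal sketch that rebuilds the predictions by hand: the shared intercept plus whichever slope matches the sign of z. The seed, the data frame d and the names new and by_hand are assumptions for illustration, not part of the session above.

```r
set.seed(1)  # assumed seed, only so the sketch is reproducible
d <- data.frame(y = rnorm(10), x = rnorm(10), z = rnorm(10))
mod <- lm(y ~ x:I(z > 0), data = d)

# Coefficient order: (Intercept), slope for z <= 0 (FALSE), slope for z > 0 (TRUE)
b <- unname(coef(mod))

new <- data.frame(x = 1:10, z = rep(c(-1, 1), 5))
pred <- predict(mod, newdata = new)

# Shared intercept plus the slope selected by the sign of z
by_hand <- b[1] + ifelse(new$z > 0, b[3], b[2]) * new$x

print(all.equal(unname(pred), by_hand))
```

Both groups share one intercept in this formula, so the two fitted lines meet at x = 0 and differ only in slope.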
It may help to look at its model matrix:
> model.matrix(mod)
   (Intercept) x:I(z > 0)FALSE x:I(z > 0)TRUE
1            1      -0.2866252     0.00000000
2            1       0.0000000    -0.03197743
3            1      -0.7427334     0.00000000
4            1       2.0852202     0.00000000
5            1       0.8548904     0.00000000
6            1       0.0000000     1.00044600
7            1       0.0000000    -1.18411791
8            1       0.0000000    -1.54110256
9            1       0.0000000    -0.21173300
10           1       0.0000000     0.17035257
attr(,"assign")
[1] 0 1 1
attr(,"contrasts")
attr(,"contrasts")$`I(z > 0)`
[1] "contr.treatment"
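As a rough check on how to read that matrix, the sketch below compares it with columns built by hand: x zeroed out wherever the condition does not hold. The seed and the names mm and manual are assumptions for illustration.

```r
set.seed(1)  # assumed seed, only so the sketch is reproducible
y <- rnorm(10)
x <- rnorm(10)
z <- rnorm(10)

mm <- model.matrix(lm(y ~ x:I(z > 0)))

# Intercept column, then x where z <= 0 (FALSE), then x where z > 0 (TRUE)
manual <- cbind(1, x * (z <= 0), x * (z > 0))

same <- all.equal(unname(mm), manual, check.attributes = FALSE)
print(same)
```

Each row has x in exactly one of the two interaction columns and 0 in the other, which is why the two slopes are estimated from disjoint sets of observations.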
Answer 1 (score: 2)
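y <- c(4.17,5.58,5.18,6.11,4.50,4.61,5.17,4.53,5.33,5.14)
# one random integer z per observation
z <- sample(x = -10:10, size = length(y), replace = TRUE)
x <- c(4.81,4.17,4.41,3.59,5.87,3.83,6.03,4.89,4.32,4.69)
a <- rnorm(n = length(x))
# I() protects the arithmetic so each product becomes a single regressor
lm(y ~ a + I(x * 1 * I(z > 0)) + I(x * 1 * I(z <= 0)))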
But I think using the : operator, as in DWin's solution, is more elegant.
Edit:
> lm(y ~ a + I(x * 1 * I(z > 0)) + I(x * 1 * I(z <= 0)))
Call:
lm(formula = y ~ a + I(x * 1 * I(z > 0)) + I(x * 1 * I(z <= 0)))
Coefficients:
         (Intercept)                     a   I(x * 1 * I(z > 0))  I(x * 1 * I(z <= 0))
              6.5775               -0.1345               -0.3352               -0.3366
> lm(formula = y ~ a+ x:I(z > 0))
Call:
lm(formula = y ~ a + x:I(z > 0))
Coefficients:
    (Intercept)                a  x:I(z > 0)FALSE   x:I(z > 0)TRUE
         6.5775          -0.1345          -0.3366          -0.3352
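The two outputs above contain the same estimates, with the two slopes listed in opposite order. A minimal sketch of that comparison, reusing the y and x vectors from this answer; the seed and the names fit_I and fit_colon are assumptions for illustration.

```r
set.seed(123)  # assumed seed, only so the sketch is reproducible
y <- c(4.17,5.58,5.18,6.11,4.50,4.61,5.17,4.53,5.33,5.14)
x <- c(4.81,4.17,4.41,3.59,5.87,3.83,6.03,4.89,4.32,4.69)
z <- sample(x = -10:10, size = length(y), replace = TRUE)
a <- rnorm(n = length(x))

fit_I     <- lm(y ~ a + I(x * 1 * I(z > 0)) + I(x * 1 * I(z <= 0)))
fit_colon <- lm(y ~ a + x:I(z > 0))

# fit_I lists the z > 0 slope first; fit_colon lists FALSE (z <= 0) first,
# so swap positions 3 and 4 before comparing
cI <- unname(coef(fit_I))[c(1, 2, 4, 3)]
cC <- unname(coef(fit_colon))

print(all.equal(cI, cC))
```

With the colon form, the FALSE coefficient is the slope on x when z <= 0 and the TRUE coefficient is the slope when z > 0.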