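Given the data ds, compute a new variable ds$h1 from ds$raw1 and ds$raw2 according to the harmonization rules specified in the object hrule.
The reproducible example contains the responses of 10 people on two measures, raw1 and raw2: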
> ds
   id raw1 raw2
1   1    1    1
2   2    1    0
3   3    0    1
4   4    0    0
5   5   NA    1
6   6   NA    0
7   7    1   NA
8   8    0   NA
9   9   NA   NA
10 10    1    1
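According to some rules (developed qualitatively), these two variables need to be converted into a single harmonized variable. The harmonization rules are encoded in the object hrule: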
> hrule
  raw1 raw2 h1
1    0    0  0
2    0    1  1
3    0   NA  0
4    1    0  1
5    1    1  1
6    1   NA  1
7   NA    0  0
8   NA    1  1
9   NA   NA NA
Thus, the rule in row 1 should be read as: if a respondent provides the value 0 on raw1 and the value 0 on raw2, then the value of h1 should be 0.
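To make the reading of a single rule concrete, here is a minimal base R sketch that looks up the harmonized value for one response profile. The helpers same_value() and lookup_rule() are hypothetical names introduced only for this illustration, and the sketch assumes that NA counts as a legitimate response value that has to match an NA in the rule table.

```r
# Rule table from the question, rebuilt by hand so the sketch is self-contained
hrule <- data.frame(
  raw1 = c(0, 0, 0, 1, 1, 1, NA, NA, NA),
  raw2 = c(0, 1, NA, 0, 1, NA, 0, 1, NA),
  h1   = c(0, 1, 0, 1, 1, 1, 0, 1, NA)
)

# Hypothetical helper: TRUE where x equals value, with NA treated as a value
same_value <- function(x, value) {
  if (is.na(value)) is.na(x) else !is.na(x) & x == value
}

# Hypothetical helper: harmonized value for one (raw1, raw2) response profile
lookup_rule <- function(hrule, raw1, raw2, harmony_name = "h1") {
  hit <- same_value(hrule$raw1, raw1) & same_value(hrule$raw2, raw2)
  hrule[[harmony_name]][hit]
}

lookup_rule(hrule, raw1 = 0,  raw2 = 0)  # the profile in row 1 of hrule
lookup_rule(hrule, raw1 = NA, raw2 = 1)  # the profile in row 8 of hrule
```

A plain `==` comparison would return NA for the missing profiles, which is why the sketch tests for NA separately.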
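I would like to develop a function that takes ds, hrule, a character vector with the names of the variables (c("raw1","raw2")), and the name of the harmonized variable ("h1"), and outputs a new harmonized variable (ds$h1).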
library(magrittr) # provides the %>% pipe
(ds <- data.frame("id" = 1:10,
                  "raw1" = c(1,1,0,0,NA,NA,1 ,0 ,NA,1),
                  "raw2" = c(1,0,1,0,1 ,0 ,NA,NA,NA,1)))
# one row per observed (raw1, raw2) response profile, with its frequency
(response_profile <- ds %>% dplyr::group_by(raw1, raw2) %>% dplyr::summarize(count = dplyr::n()))
# attach the harmonized value h1 to each response profile
(hrule <- cbind(response_profile, "h1" = c(0,1,0,1,1,1,0,1,NA)))
new_function <- function(ds, hrule,
                         variable_names, # variable_names = c("raw1","raw2"), the number will vary
                         harmony_name    # harmony_name = "h1", there might be "h2"
){
}
Thanks in advance for your ideas!
Answer 0 (score: 0)
Here is the complete solution, as suggested by @Symbolix:
rm(list=ls(all=TRUE)) # Clear the memory of variables from previous run. This is not called by knitr, because it's above the first chunk.
cat("\f") # clear the console
library(magrittr) # provides the %>% pipe
(ds <- data.frame("id" = 1:10,
                  "raw1" = c(1,1,0,0,NA,NA,1 ,0 ,NA,1),
                  "raw2" = c(1,0,1,0,1 ,0 ,NA,NA,NA,1)))
# one row per observed (raw1, raw2) response profile
response_profile <- ds %>% dplyr::group_by(raw1, raw2) %>% dplyr::summarize(count = dplyr::n()) %>% dplyr::select(-count)
(hrule <- cbind(response_profile,
                "h1" = c(0,1,0 ,1,1,1 ,0 ,1 ,NA), # at least one 1 to produce 1
                "h2" = c(0,0,NA,0,1,NA,NA,NA,NA)  # both must be 1
))
recode_from_meta <- function(ds, hrule, variable_names, harmony_name){
  # left join on the rule variables; merge() matches NA keys to NA and sorts the result by the keys
  d <- merge(ds, hrule[, c(variable_names, harmony_name)], by = variable_names, all.x = TRUE)
  return(d)
}
> hrule
  raw1 raw2 h1 h2
1    0    0  0  0
2    0    1  1  0
3    0   NA  0 NA
4    1    0  1  0
5    1    1  1  1
6    1   NA  1 NA
7   NA    0  0 NA
8   NA    1  1 NA
9   NA   NA NA NA
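Before applying the rules, it may be worth checking that every response profile observed in ds has exactly one row in hrule, since a profile without a rule would silently receive NA in the merge. The following is only a sketch under stated assumptions: profile_key() is a hypothetical helper, the data are rebuilt by hand so the block is self-contained, and the approach assumes that no real response is the literal string "NA" or contains the "|" separator.

```r
ds <- data.frame(id   = 1:10,
                 raw1 = c(1, 1, 0, 0, NA, NA, 1, 0, NA, 1),
                 raw2 = c(1, 0, 1, 0, 1, 0, NA, NA, NA, 1))
hrule <- data.frame(raw1 = c(0, 0, 0, 1, 1, 1, NA, NA, NA),
                    raw2 = c(0, 1, NA, 0, 1, NA, 0, 1, NA),
                    h1   = c(0, 1, 0, 1, 1, 1, 0, 1, NA),
                    h2   = c(0, 0, NA, 0, 1, NA, NA, NA, NA))

# Hypothetical helper: collapse the rule variables into one string key per row.
# paste() renders NA as "NA", so missing values take part in the comparison.
profile_key <- function(d, variable_names) {
  do.call(paste, c(unname(as.list(d[variable_names])), sep = "|"))
}

variable_names <- c("raw1", "raw2")
ds_keys    <- profile_key(ds, variable_names)
hrule_keys <- profile_key(hrule, variable_names)

# Profiles that occur in ds but have no rule in hrule
uncovered <- unique(ds_keys[!ds_keys %in% hrule_keys])
# Profiles that have more than one rule, which would duplicate rows in a merge
duplicated_rules <- unique(hrule_keys[duplicated(hrule_keys)])

length(uncovered)         # number of profiles without a rule
length(duplicated_rules)  # number of profiles with conflicting rules
```

For the example data both vectors should come back empty, because hrule was derived from the profiles in ds.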
> (d <- recode_from_meta(ds, hrule, variable_names=c("raw1", "raw2"), harmony_name="h1"))
   raw1 raw2 id h1
1     0    0  4  0
2     0    1  3  1
3     0   NA  8  0
4     1    0  2  1
5     1    1  1  1
6     1    1 10  1
7     1   NA  7  1
8    NA    0  6  0
9    NA    1  5  1
10   NA   NA  9 NA
> (d <- recode_from_meta(ds, hrule, variable_names=c("raw1", "raw2"), harmony_name="h2"))
   raw1 raw2 id h2
1     0    0  4  0
2     0    1  3  0
3     0   NA  8 NA
4     1    0  2  0
5     1    1  1  1
6     1    1 10  1
7     1   NA  7 NA
8    NA    0  6 NA
9    NA    1  5 NA
10   NA   NA  9 NA
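The merge above sorts the result by raw1 and raw2 and moves id after the key columns, so the rows no longer line up with the original ds. If the goal is to append the harmonized variable to ds itself, as the question asks (ds$h1), one possible alternative is to look up each row's rule with match(). This is a sketch rather than the method from the answer: recode_keep_order() is a hypothetical name, and the string key assumes that no real response is the literal string "NA" or contains "|".

```r
ds <- data.frame(id   = 1:10,
                 raw1 = c(1, 1, 0, 0, NA, NA, 1, 0, NA, 1),
                 raw2 = c(1, 0, 1, 0, 1, 0, NA, NA, NA, 1))
hrule <- data.frame(raw1 = c(0, 0, 0, 1, 1, 1, NA, NA, NA),
                    raw2 = c(0, 1, NA, 0, 1, NA, 0, 1, NA),
                    h1   = c(0, 1, 0, 1, 1, 1, 0, 1, NA),
                    h2   = c(0, 0, NA, 0, 1, NA, NA, NA, NA))

# Hypothetical alternative to recode_from_meta(): keeps the row and column
# order of ds and appends the harmonized variable as a new column.
recode_keep_order <- function(ds, hrule, variable_names, harmony_name) {
  # one string key per row; paste() renders NA as "NA", so NA profiles match
  key <- function(d) {
    do.call(paste, c(unname(as.list(d[variable_names])), sep = "|"))
  }
  # for each row of ds, the index of its rule in hrule (NA if there is no rule)
  rule_row <- match(key(ds), key(hrule))
  ds[[harmony_name]] <- hrule[[harmony_name]][rule_row]
  ds
}

ds <- recode_keep_order(ds, hrule, c("raw1", "raw2"), "h1")
ds <- recode_keep_order(ds, hrule, c("raw1", "raw2"), "h2")
ds
```

Because the function returns ds with one extra column, it can be called once per harmonized variable, and the values agree with the merge output when both are compared by id.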