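I have exported Survey Monkey data. For each question, the export creates a separate column for every answer option; a column holds a character value if the respondent selected that option and NA otherwise (see the df below).
I would like to create a new binary column based on the same condition applied across multiple columns.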
diag <- structure(list(
  diag_stress_fracture = c(NA, "Stress fracture(s)", NA, NA, NA, NA),
  diag_disordered_eating = c(NA_character_, NA_character_, NA_character_,
    NA_character_, NA_character_, NA_character_),
  diag_asthma = c(NA, "Asthma", NA, NA, NA, NA),
  diag_low_bone_density = c(NA_character_, NA_character_, NA_character_,
    NA_character_, NA_character_, NA_character_),
  diag_acl_rupture = c(NA_character_, NA_character_, NA_character_,
    NA_character_, NA_character_, NA_character_),
  diag_concussion = c(NA, "Concussion", NA, NA, NA, NA),
  diag_depression_or_anxiety = c(NA_character_, NA_character_, NA_character_,
    NA_character_, NA_character_, NA_character_),
  diag_haemochromatosis = c(NA_character_, NA_character_, NA_character_,
    NA_character_, NA_character_, NA_character_),
  diag_hypothyroidism = c(NA_character_, NA_character_, NA_character_,
    NA_character_, NA_character_, NA_character_),
  diag_oligomenorrhea_or_amenorrhoea = c(NA_character_, NA_character_,
    NA_character_, NA_character_, NA_character_, NA_character_)),
  .Names = c("diag_stress_fracture", "diag_disordered_eating",
    "diag_asthma", "diag_low_bone_density", "diag_acl_rupture",
    "diag_concussion", "diag_depression_or_anxiety", "diag_haemochromatosis",
    "diag_hypothyroidism", "diag_oligomenorrhea_or_amenorrhoea"),
  row.names = c(NA, 6L), class = "data.frame")
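Basically, I want to know whether a participant has any diagnosis at all, whatever it is. I can get the desired result with the following code (where ... stands for the remaining columns of interest above, truncated in this example):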
library(dplyr)

diag <- diag %>%
  mutate(diag.yn = ifelse(!is.na(diag_stress_fracture) |
                            !is.na(diag_disordered_eating) |
                            !is.na(diag_asthma) | ... , 1, 0))
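However, since I want to do this for several questions, this is clearly clunky and time-consuming. Is there a way to do this using column positions? In my full dataset these are columns 38:47.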
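To make the goal concrete, here is a minimal base R sketch of the kind of position-based approach I have in mind. It is only an illustration under assumptions: `toy`, its column names, and the positions `2:4` are made up for the example (in the full dataset the positions would be 38:47), and I have not checked it against the real export.

```r
# Toy data standing in for the survey export (hypothetical names and values).
toy <- data.frame(
  id     = 1:4,
  diag_a = c(NA, "A", NA, NA),
  diag_b = c(NA, NA, NA, "B"),
  diag_c = c(NA, "C", NA, NA),
  stringsAsFactors = FALSE
)

# Columns of interest selected by position (assumed; 38:47 in the full data).
cols <- 2:4

# !is.na() on a data frame gives a logical matrix; rowSums() then counts
# the non-missing answers in each row. A count above zero means that the
# respondent selected at least one option.
toy$diag.yn <- as.integer(rowSums(!is.na(toy[, cols, drop = FALSE])) > 0)

print(toy$diag.yn)
```

The `drop = FALSE` keeps the selection a data frame even if only a single column position is given, so `rowSums()` still receives a two-dimensional object.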