I have a dataset made up of several hundred diagnosis codes, and I want to collapse them into broader conditions. For example,
dt <- data.frame(Diagnosis=c("A415","A419","B519","B589","T814"),Broader.Condition=NA)
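is a snapshot of my current data. I thought I could loop over each diagnosis code, check whether it is one we are interested in, and then write the broader diagnosis into the corresponding column. Here is my attempt: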
for(i in 1:length(dt$Diagnosis)){
if(dt$Diagnosis[i] == "A415"||"A419"||"B519"||"B589"||"T814"){
dt$Broader.Condition[i] = "Skull and Face Fractures"}
However, I don't think I am using the || ('or') operator correctly, because it raises:
Error in dt$Diagnosis[i] == "A415" || "A419" : invalid 'y' type in 'x'||'y'
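Any advice on this, or on 'or' statements inside loops in general, would be greatly appreciated. I will be extending this to every code and its corresponding broader condition, using multiple 'if' statements inside my 'for' loop.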
Answer 0 (score: 3)
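If we are comparing against multiple elements, it is better to use %in%, which returns a logical vector. Using it, we assign 'Skull and Face Fractures' to the matching elements of 'Broader.Condition'.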
dt$Broader.Condition[dt$Diagnosis %in% values] <- "Skull and Face Fractures"
where
values <- c("A415", "A419", "B519", "B589", "T814")
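Because values has to exist before the assignment runs, it may help to see the steps in execution order. The following is a minimal sketch that rebuilds the sample data from the question; the intermediate name is_match is only for illustration and is not part of the original answer.

```r
# Sample data from the question
dt <- data.frame(Diagnosis = c("A415", "A419", "B519", "B589", "T814"),
                 Broader.Condition = NA)

# Codes that map to the broader condition (define these first)
values <- c("A415", "A419", "B519", "B589", "T814")

# %in% is vectorised: it yields one TRUE/FALSE per row of dt
is_match <- dt$Diagnosis %in% values   # is_match is an illustrative name

# Fill the column only where the code was found in values
dt$Broader.Condition[is_match] <- "Skull and Face Fractures"
```

No loop is needed, because the logical vector selects all matching rows at once.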
If there are more values to replace, we can use a key/value dataset
kv <- data.frame(Diagnosis = c("A415", "A419", "B519", "T814", "B589"),
Value = c("Skull", "Face", "Skin", "Skull", "Face"), stringsAsFactors=FALSE)
dt$Broader.Condition <- kv$Value[match(dt$Diagnosis, kv$Diagnosis)]
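One thing worth knowing about the match lookup is that a code missing from kv produces NA instead of an error. A small sketch of that behaviour follows; dt2, idx, and the code "Z999" are made up here purely to illustrate the unmatched case.

```r
kv <- data.frame(Diagnosis = c("A415", "A419", "B519", "T814", "B589"),
                 Value = c("Skull", "Face", "Skin", "Skull", "Face"),
                 stringsAsFactors = FALSE)

# "Z999" is a hypothetical code that does not appear in kv
dt2 <- data.frame(Diagnosis = c("A415", "B589", "Z999"),
                  stringsAsFactors = FALSE)

# match() gives the row position of each code in kv, or NA when absent
idx <- match(dt2$Diagnosis, kv$Diagnosis)

# Indexing with NA yields NA, so unmatched codes stay missing
dt2$Broader.Condition <- kv$Value[idx]
```

Checking is.na(dt2$Broader.Condition) afterwards is a simple way to list codes that still need a mapping.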
Answer 1 (score: 1)
By the way, your braces are unbalanced. I removed all of them, since they are not needed here.
Way 1:
for(i in 1:length(dt$Diagnosis))
  if(dt$Diagnosis[i] == "A415"||dt$Diagnosis[i] == "A419"||dt$Diagnosis[i] == "B519"||dt$Diagnosis[i] == "B589"||dt$Diagnosis[i] == "T814")
    dt$Broader.Condition[i] = "Skull and Face Fractures"
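The reason the comparison has to be repeated on each side is that == binds more tightly than ||, so x == "A415" || "A419" is read as (x == "A415") || "A419", and the bare string "A419" is not a logical value. A minimal sketch of the difference, where code, res, and ok are illustrative names:

```r
code <- "A419"   # illustrative single diagnosis code

# Original form: the right-hand side of || is the string "A419", not a comparison.
# The left side is FALSE here, so R evaluates the right side and signals an error.
res <- tryCatch(code == "A415" || "A419",
                error = function(e) "error")

# Corrected form: each side of || is a complete comparison
ok <- code == "A415" || code == "A419"
```

Here res holds the string "error" because the first form fails, while ok is TRUE.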
Way 2:
for(i in 1:length(dt$Diagnosis))
  if(dt$Diagnosis[i] %in% c("A415","A419","B519","B589","T814"))
    dt$Broader.Condition[i] = "Skull and Face Fractures"
Way 3 (note: akrun posted a similar solution 4 minutes before I posted mine):
is.skull.fractured = dt$Diagnosis %in% c("A415","A419","B519","B589","T814")
dt$Broader.Condition[is.skull.fractured] = "Skull and Face Fractures"