Input table
Patients Hospital Drug Response
1 AAA a Good
1 AAA a Bad
2 BBB a Bad
3 CCC b Good
4 CCC c Bad
5 DDD e undefined
Output file
Patients Hospital Drug Response
1 AAA a 1
1 AAA a -1
2 BBB a -1
3 CCC b 1
4 CCC c -1
5 DDD e
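How can I replace three text values in one column with numbers and a blank?
"Good" in the Response column becomes "1", "Bad" in the Response column becomes "-1", and "undefined" in the Response column becomes "".
Data: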
structure(list(Patients = c(1L, 1L, 2L, 3L, 4L, 5L), Hospital = structure(c(1L,
1L, 2L, 3L, 3L, 4L), .Label = c("AAA", "BBB", "CCC", "DDD"), class = "factor"),
Drug = structure(c(1L, 1L, 1L, 2L, 3L, 4L), .Label = c("a",
"b", "c", "e"), class = "factor"), Response = structure(c(2L,
1L, 1L, 2L, 1L, 3L), .Label = c("Bad", "Good", "undefined"
), class = "factor")), .Names = c("Patients", "Hospital",
"Drug", "Response"), class = "data.frame", row.names = c(NA,
-6L))
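Answer 0 (score: 16)
You can do this by changing the labels of the factor Response. The labels are given in the order of the existing levels ("Bad", "Good", "undefined"):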
> within(df, Response <- factor(Response, labels = c(-1, 1, "")))
Patients Hospital Drug Response
1 1 AAA a 1
2 1 AAA a -1
3 2 BBB a -1
4 3 CCC b 1
5 4 CCC c -1
6 5 DDD e
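The call above depends on the levels being stored in the order "Bad", "Good", "undefined". As a minimal sketch, assuming the data are rebuilt from the dput output in the question with Response as plain text, the levels can be named explicitly so that each label is paired with a known level:

```r
# Rebuild the question's data (values taken from the dput output above)
df <- data.frame(
  Patients = c(1L, 1L, 2L, 3L, 4L, 5L),
  Hospital = c("AAA", "AAA", "BBB", "CCC", "CCC", "DDD"),
  Drug     = c("a", "a", "a", "b", "c", "e"),
  Response = c("Good", "Bad", "Bad", "Good", "Bad", "undefined")
)

# Spell out the levels so the labels do not depend on alphabetical order
df$Response <- factor(df$Response,
                      levels = c("Good", "Bad", "undefined"),
                      labels = c("1", "-1", ""))

# The column is still a factor; convert to character to see the raw values
as.character(df$Response)
```

The result is still a factor whose labels happen to look like numbers, so it should be converted before doing any arithmetic on it.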
Answer 1 (score: 5)
Catherine, your question could still be answered by a very basic R textbook. See Dirk's comment on the previous question.
Answer
If d is your data frame, then:
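# Response is a factor in the posted data; convert it first, otherwise
# the new values are invalid factor levels and become NA with a warning
d$Response <- as.character(d$Response)
d[d$Response == "Good",]$Response = 1
d[d$Response == "Bad",]$Response = -1
d[d$Response == "undefined",]$Response = ""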
My guess (I may be wrong) is that "undefined" stands for missing data. In that case, use NA rather than a blank. Any basic R book describes NA.
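One way the NA suggestion could look in practice is sketched below. The named lookup vector codes and the new column Score are hypothetical names introduced here for illustration, and the Response values are copied from the question. Using NA instead of "" keeps the column numeric:

```r
# Response values from the question, kept as plain character strings
d <- data.frame(
  Response = c("Good", "Bad", "Bad", "Good", "Bad", "undefined"),
  stringsAsFactors = FALSE
)

# Hypothetical lookup vector: names are the original text, values the codes
codes <- c(Good = 1, Bad = -1, undefined = NA)

# Index the lookup vector by the text values; unname() drops the names
d$Score <- unname(codes[d$Response])

d$Score
is.numeric(d$Score)

# NA can be skipped explicitly in summaries
mean(d$Score, na.rm = TRUE)
```

With a blank string the whole column would be stored as text, so numeric summaries such as mean() would not work on it directly.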
Answer 2 (score: 2)
If your data is in the data frame df:
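# Convert the factor to character first so the new values are accepted
df$Response <- as.character(df$Response)
df$Response[df$Response == "Good"] <- 1
df$Response[df$Response == "Bad"] <- -1
df$Response[df$Response == "undefined"] <- ""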
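Answer 3 (score: 2)
You can use a simple nested ifelse() statement. The example below uses made-up data rather than the table from the question: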
cath <- data.frame(nmbrs = runif(10), words = sample(c("good", "bad"), 10, replace = TRUE))
cath$words <- ifelse(cath$words == "good", 1, ifelse(cath$words == "bad", -1, ""))
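The same nested ifelse() pattern can be applied to the Response values from the question. This is only a sketch: the vector names Response and recoded are chosen here for illustration. Because one of the outcomes is "", the result is a character vector, not a numeric one:

```r
# Response values from the question, stored as a factor as in the dput output
Response <- factor(c("Good", "Bad", "Bad", "Good", "Bad", "undefined"))

# Comparing a factor with a string compares against its labels
recoded <- ifelse(Response == "Good", "1",
           ifelse(Response == "Bad", "-1", ""))

recoded
class(recoded)
```

Anything that is neither "Good" nor "Bad" falls through to the blank, so an unexpected spelling would be blanked silently.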