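This is definitely a newbie question, but I'm stuck and can't find comparable help online. I'm trying to compare two columns of a data frame to create a third column. Here, I want to compare Distx and Disty. If either one has a value, I want to keep it and put it in a new column, Distz. If both are "Missing", I want to put "Missing" in Distz. Below is the data frame I'd like to end up with.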
ID <- c(1, 2, 3, 4, 5, 6)
Distx <- c("A", "B", "Missing", "Missing", "G", "Missing")
Disty <- c("Missing", "Missing", "C", "Missing", "Missing", "E")
Distz <- c("A", "B", "C", "Missing", "G", "E")  # desired result, as in the printed table below
mydf <- data.frame(ID, Distx, Disty, Distz)
mydf
  ID   Distx   Disty   Distz
1  1       A Missing       A
2  2       B Missing       B
3  3 Missing       C       C
4  4 Missing Missing Missing
5  5       G Missing       G
6  6 Missing       E       E
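Here is the code that doesn't work. At first I thought I wasn't indexing correctly, but the second attempt below produced the same result. There are no error messages, but the result is all 1's rather than the actual values from the columns.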
for (i in seq(1:nrow(mydf))){
if (mydf$Distx[i] == "Missing" && mydf$Disty[i] != "Missing"){
mydf$Distz[i]<- mydf$Disty[i]}
if (mydf$Distx[i] != "Missing" && mydf$Disty[i] == "Missing"){
mydf$Distz[i]<- mydf$Distx[i]}
if (mydf$Distx[i] == "Missing" && mydf$Disty[i] == "Missing"){
mydf$Distz[i]<- "Missing"}
}
#for the purposes of readability I only ran two of the tests in this code
within(mydf, {
Distz <- ifelse(Distx == "Missing" & Disty != "Missing", Disty, ifelse(Distx != "Missing" & Disty == "Missing", Distx))
})
#Both results look like this ...???
  ID   Distx   Disty Distz
1  1       A Missing     1
2  2       B Missing     1
3  3 Missing       C     1
4  4 Missing Missing     1
5  5       G Missing     1
6  6 Missing       E     1
Thanks in advance for any help.
Answer 0 (score: 1)
You could try nested ifelse statements:
mydf$Distz <- with(mydf, ifelse(Distx == "Missing" & Disty == "Missing", "Missing",
ifelse(Distx != "Missing", as.character(Distx),
ifelse(Disty != "Missing", as.character(Disty), NA))))
mydf
#   ID   Distx   Disty   Distz
# 1  1       A Missing       A
# 2  2       B Missing       B
# 3  3 Missing       C       C
# 4  4 Missing Missing Missing
# 5  5       G Missing       G
# 6  6 Missing       E       E
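The problem with your code is that your variables are of class "factor", not "character", so the code records the factor's underlying integer level codes rather than the factor labels. The code above fixes this by using as.character() to coerce the factors to character.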
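The difference between a factor's codes and its labels can be seen in isolation. The following is a minimal sketch; the vector `x` is made up for illustration and is not taken from the question's data.

```r
# Build a factor explicitly so the behaviour is the same on any R version
x <- factor(c("A", "B", "Missing"))

# ifelse() drops the factor attributes of `yes`, so the result holds the
# underlying integer codes (A = 1, B = 2) instead of the labels
codes <- ifelse(x != "Missing", x, NA)
print(codes)
# [1]  1  2 NA

# Converting to character first keeps the labels
labels <- ifelse(x != "Missing", as.character(x), NA)
print(labels)
# [1] "A" "B" NA
```

This is why the nested ifelse above wraps each column in as.character() before returning it.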
Answer 1 (score: 1)
You could also do:
indx <- mydf[-1]!='Missing'
mydf$Distz <- mydf[-1][cbind(1:nrow(mydf), max.col(indx))]
mydf
#  ID   Distx   Disty   Distz
#1  1       A Missing       A
#2  2       B Missing       B
#3  3 Missing       C       C
#4  4 Missing Missing Missing
#5  5       G Missing       G
#6  6 Missing       E       E
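To see what the two lines above are doing, it may help to split them into intermediate steps. This is only a sketch of the same idea: the names `pick` and `pos` are introduced here for illustration, and `ties.method = "first"` is an assumption added so the result is deterministic (the answer's code relies on the default, which breaks ties at random; that is harmless here because tied rows contain "Missing" in both columns).

```r
mydf <- data.frame(
  ID = c(1, 2, 3, 4, 5, 6),
  Distx = c("A", "B", "Missing", "Missing", "G", "Missing"),
  Disty = c("Missing", "Missing", "C", "Missing", "Missing", "E"),
  stringsAsFactors = FALSE
)

# Logical matrix: TRUE where a cell holds a real value, FALSE for "Missing"
indx <- mydf[-1] != "Missing"

# For each row, the position of the column holding TRUE; when the whole row
# is FALSE the columns tie and the first one is chosen
pick <- max.col(indx, ties.method = "first")

# Two-column matrix of (row, column) pairs, used to pull one cell per row
pos <- cbind(seq_len(nrow(mydf)), pick)
mydf$Distz <- as.matrix(mydf[-1])[pos]

print(mydf$Distz)
# [1] "A"       "B"       "C"       "Missing" "G"       "E"
```

Indexing a matrix with a two-column matrix of positions extracts exactly one element per row, which is what makes this approach work without a loop.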
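Note: the columns I used are of class 'character'. You can create the 'data.frame' with stringsAsFactors=FALSE so that 'character' columns are not converted to class 'factor'. It is better to work with 'character' rather than 'factor' columns here. The data I used: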
mydf <- structure(list(ID = c(1, 2, 3, 4, 5, 6), Distx = c("A", "B",
"Missing", "Missing", "G", "Missing"), Disty = c("Missing", "Missing",
"C", "Missing", "Missing", "E")), .Names = c("ID", "Distx", "Disty"
), row.names = c(NA, -6L), class = "data.frame")
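If the data frame already exists with factor columns, they can be converted after the fact. The sketch below forces factors with stringsAsFactors = TRUE purely to reproduce the situation (since R 4.0.0 the default is FALSE, so newer versions will not create factors here unless asked), then checks the column classes and converts them; the anonymous helper function is an illustration, not part of the original answer.

```r
# Recreate the question's situation: character columns stored as factors
mydf <- data.frame(
  ID = c(1, 2, 3, 4, 5, 6),
  Distx = c("A", "B", "Missing", "Missing", "G", "Missing"),
  Disty = c("Missing", "Missing", "C", "Missing", "Missing", "E"),
  stringsAsFactors = TRUE
)

# Inspect the class of every column
classes_before <- sapply(mydf, class)
print(classes_before)

# Convert only the factor columns to character, leaving the others untouched;
# assigning into mydf[] keeps the object a data.frame
mydf[] <- lapply(mydf, function(col) {
  if (is.factor(col)) as.character(col) else col
})

classes_after <- sapply(mydf, class)
print(classes_after)
```

After the conversion, comparisons and assignments such as those in the question operate on the labels themselves rather than on integer codes.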