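I merged two datasets in R on the variable Name. I need a way to combine the School.x and School.y columns so that an NA in one column is filled from the other. I assume there is a simple way to do this in R. Does anyone have an idea?
An image of a sample of the merged data is below (for some reason I couldn't paste a table.)
Bonus points if you can figure out how to make it so that, when both School.x and School.y have a value, the merged column just defaults to the School.x value. (For example, if Chloe were listed as Lincoln in School.x and as Lincoln High in School.y, I would want the column to default to Lincoln.)
Thanks very much!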
Answer 0 (score: 3)
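The following code should do the trick. I modified the third row of the example data frame so that School.x and School.y both have non-NA values, to demonstrate the bonus part of your question.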
df <- data.frame(Name=c("Adam", "Bob", "Jane", "Bill", "Chloe", "Mandy"),
                 School.x=c("Hillcrest", NA, "Irvington", NA, "Lincoln", NA),
                 School.y=c(NA, "Star Academy", "Star Academy", "Mission Hills", NA,
                            "Washington Middle"),
                 stringsAsFactors=FALSE)
# choose default value from 'School.x' in the case that both 'School.x' and
# 'School.y' have values
df$merged[!is.na(df$School.x) & !is.na(df$School.y)] <-
  df$School.x[!is.na(df$School.x) & !is.na(df$School.y)]
# replace NA values in 'School.x' with values from 'School.y' and vice-versa
df$merged[is.na(df$School.x)] <- df$School.y[is.na(df$School.x)]
df$merged[is.na(df$School.y)] <- df$School.x[is.na(df$School.y)]
> df
   Name  School.x          School.y            merged
1  Adam Hillcrest              <NA>         Hillcrest
2   Bob      <NA>      Star Academy      Star Academy
3  Jane Irvington      Star Academy         Irvington
4  Bill      <NA>     Mission Hills     Mission Hills
5 Chloe   Lincoln              <NA>           Lincoln
6 Mandy      <NA> Washington Middle Washington Middle
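As a more compact alternative (not part of the original answer), the same result can probably be had with a single call to base R's ifelse(). This is only a sketch, assuming both columns are character vectors as in the example data: it takes School.x wherever it is not NA and falls back to School.y otherwise, which also covers the "default to School.x" case.

```r
# Same example data as in the answer above.
df <- data.frame(Name = c("Adam", "Bob", "Jane", "Bill", "Chloe", "Mandy"),
                 School.x = c("Hillcrest", NA, "Irvington", NA, "Lincoln", NA),
                 School.y = c(NA, "Star Academy", "Star Academy", "Mission Hills", NA,
                              "Washington Middle"),
                 stringsAsFactors = FALSE)

# Prefer School.x; use School.y only where School.x is NA.
# If both are NA, the merged value stays NA.
df$merged <- ifelse(is.na(df$School.x), df$School.y, df$School.x)

print(df)
```

Because School.x is checked first, a row such as Jane's, where both columns have a value, keeps the School.x entry.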