Merging two columns in R, filling NA with values

Asked: 2015-11-17 06:05:54

Tags: r merge

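I merged two datasets in R on the variable Name. I need a way to combine the School.x and School.y columns so that an NA field in one is filled from the other column. I figure there is some simple way to do this in R - does anyone have an idea?

Below is an image of a sample of the merged data (for some reason I couldn't paste the table).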


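As a bonus, I'd like to know how to make it so that if School.x and School.y both have a value, the merged column just defaults to the value from School.x. (For example, if Chloe is listed as Lincoln University in School.x and Lincoln High in School.y, I just want the column to default to Lincoln.)

Thanks so much!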

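1 answer:

Answer 0: (score: 3)

The following code should do the trick. I modified the third row of the sample data frame so that School.x and School.y both have non-NA values, to demonstrate the bonus part of your question.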

df <- data.frame(Name=c("Adam", "Bob", "Jane", "Bill", "Chloe", "Mandy"),
                 School.x=c("Hillcrest", NA, "Irvington", NA, "Lincoln", NA),
                 School.y=c(NA, "Star Academy", "Star Academy", "Mission Hills", NA,
                            "Washington Middle"),
                 stringsAsFactors=FALSE)


# choose default value from 'School.x' in the case that both 'School.x' and
# 'School.y' have values
df$merged[!is.na(df$School.x) & !is.na(df$School.y)] <-
    df$School.x[!is.na(df$School.x) & !is.na(df$School.y)]

# fill 'merged' from whichever column is not NA: use 'School.y' where
# 'School.x' is NA, and 'School.x' where 'School.y' is NA
df$merged[is.na(df$School.x)] <- df$School.y[is.na(df$School.x)]
df$merged[is.na(df$School.y)] <- df$School.x[is.na(df$School.y)]

> df
   Name  School.x          School.y            merged
1  Adam Hillcrest              <NA>         Hillcrest
2   Bob      <NA>      Star Academy      Star Academy
3  Jane Irvington      Star Academy         Irvington
4  Bill      <NA>     Mission Hills     Mission Hills
5 Chloe   Lincoln              <NA>           Lincoln
6 Mandy      <NA> Washington Middle Washington Middle
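
The three assignments above amount to "use School.x when it is present, otherwise use School.y", so a more compact version can probably be written with base R's ifelse(). This is only a sketch under the same assumptions as the sample data: the columns are character vectors (stringsAsFactors=FALSE), and merged2 is a hypothetical column name chosen here so it doesn't clash with merged.

```r
# Same sample data as above.
df <- data.frame(Name=c("Adam", "Bob", "Jane", "Bill", "Chloe", "Mandy"),
                 School.x=c("Hillcrest", NA, "Irvington", NA, "Lincoln", NA),
                 School.y=c(NA, "Star Academy", "Star Academy", "Mission Hills", NA,
                            "Washington Middle"),
                 stringsAsFactors=FALSE)

# Take 'School.x' wherever it is not NA; otherwise fall back to 'School.y'.
# Rows where both columns are NA stay NA.
# 'merged2' is a hypothetical column name used only for this illustration.
df$merged2 <- ifelse(is.na(df$School.x), df$School.y, df$School.x)
```

If the columns were factors instead of character vectors, ifelse() would return the underlying integer codes, so it is safer to convert them with as.character() first.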