Question

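I have two one-row data frames, each with the same column names. One of the data frames has NA values in one or more columns. I want to drop the columns that contain NA values from that data frame, and drop the same columns from the second data frame.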

Sample:

Data frame 1:

age height education average
 NA  1.80   college    NA

Data frame 2:

age height education  average
 36  1.95   college     85

Result:

Data frame 1:

 height education
  1.80   college

Data frame 2:

height education
 1.95   college

How can I do this?

Answer 1

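It sounds like these are data frames, not vectors. If you combine them into a single data frame (for example with bind_rows()), you can handle both at once with dplyr and select only the columns that contain no NA values: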

library(dplyr)

df <- tribble(
    ~age, ~height, ~education, ~average,
      NA,    1.80,  "college",       NA,
      36,    1.95,  "college",       85
)

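# colSums(is.na(df)) counts the NAs per column; `!` is TRUE where the count is 0,
# and which() turns that into the positions of the columns to keep.
df %>% 
    select(which(!colSums(is.na(df))))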

#> # A tibble: 2 x 2
#>   height education
#>    <dbl>     <chr>
#> 1   1.80   college
#> 2   1.95   college
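
If the two data frames need to stay separate, as in the question, a minimal base R sketch might look like the following. It assumes both data frames have the same columns in the same order. The names df1, df2, keep, df1_clean and df2_clean are illustrative only, and the sketch drops a column when either data frame has an NA in it, which slightly generalizes what the question asks for.

```r
# Example data mirroring the question (names are illustrative)
df1 <- data.frame(age = NA, height = 1.80, education = "college", average = NA)
df2 <- data.frame(age = 36, height = 1.95, education = "college", average = 85)

# TRUE for columns that contain no NA in either data frame
# (assumes df1 and df2 share the same columns in the same order)
keep <- colSums(is.na(df1)) == 0 & colSums(is.na(df2)) == 0

# Apply the same column mask to both; drop = FALSE keeps the result
# a data frame even if only one column survives
df1_clean <- df1[, keep, drop = FALSE]
df2_clean <- df2[, keep, drop = FALSE]
```

Computing the mask once and applying it to both data frames keeps their columns aligned without having to bind the rows together first.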
