Merge columns in R that do not contain NA values

Time: 2019-06-04 04:20:26

Tags: r dataframe

I have the following data frames:

dt.1

Father  Daughter
Peter      1
Josh       3
Cold       4
NA         5
NA         6
NA         7

dt.2

Father  Weight
Peter     10
Josh      33
Cold      44
NA        55
NA        65
NA        77

I would like to merge the rows whose Father value is not NA and leave the NA rows unmatched. I need this:

Father    Weight    Daughter
Peter     10         1
Josh      33         3
Cold      44         4
NA        55         NA
NA        65         NA
NA        77         NA
NA        NA         5
NA        NA         6
NA        NA         7

I tried an ordinary merge, but it does not work: the new data frame has more rows. So I want the merge to consider only the actual (non-NA) values.
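For context, the extra rows come from the way merge() compares keys: it treats NA as equal to NA, so every NA row of one data frame is paired with every NA row of the other. The sketch below rebuilds the two data frames from the Data section further down (the call to merge() with all = TRUE is an assumption about what the "ordinary merge" looked like) and counts the resulting rows.

```r
# Data frames rebuilt from the "Data" section of the first answer
dt1 <- data.frame(
  Father   = c("Peter", "Josh", "Cold", NA, NA, NA),
  Daughter = c(1L, 3L, 4L, 5L, 6L, 7L),
  stringsAsFactors = FALSE
)
dt2 <- data.frame(
  Father = c("Peter", "Josh", "Cold", NA, NA, NA),
  Weight = c(10L, 33L, 44L, 55L, 65L, 77L),
  stringsAsFactors = FALSE
)

# Assumed form of the "ordinary merge": a full outer join on Father.
# NA keys match each other, so 3 NA rows x 3 NA rows = 9 extra pairings.
merged <- merge(dt1, dt2, by = "Father", all = TRUE)

n_rows    <- nrow(merged)               # 3 matched names + 9 NA pairings
n_na_rows <- sum(is.na(merged$Father))  # rows that came from NA-to-NA matches
print(c(rows = n_rows, na_rows = n_na_rows))
```

With this data the merge returns 12 rows instead of the 9 that are wanted, which is why the answers below handle the NA rows separately.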

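2 Answers:

Answer 0: (score: 1)

Filter each dataset for the rows where "Father" is not NA, do a full_join on those rows, and then bind the result with the remaining NA rows.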

library(tidyverse)
dt1 %>% 
  filter(is.na(Father)) %>%
  bind_rows(dt2 %>% 
                filter(is.na(Father))) %>%
  bind_rows(full_join(dt1 %>% 
                        filter(!is.na(Father)),
                      dt2 %>% filter(!is.na(Father))))%>% 
  arrange(is.na(Father), is.na(Weight)) %>% 
  select(Father, Weight, Daughter)
#   Father Weight Daughter
#1  Peter     10        1
#2   Josh     33        3
#3   Cold     44        4
#4   <NA>     55       NA
#5   <NA>     65       NA
#6   <NA>     77       NA
#7   <NA>     NA        5
#8   <NA>     NA        6
#9   <NA>     NA        7

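Another option is to split each dataset by whether Father is NA, then bind or join each pair of pieces depending on a logical condition.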

map2_df(split(dt1, is.na(dt1$Father)), split(dt2, is.na(dt2$Father)),
     ~ if(all(is.na(.x$Father))) bind_rows(.x, .y) else full_join(.x, .y))
#   Father Daughter Weight
#1  Peter        1     10
#2   Josh        3     33
#3   Cold        4     44
#4   <NA>        5     NA
#5   <NA>        6     NA
#6   <NA>        7     NA
#7   <NA>       NA     55
#8   <NA>       NA     65
#9   <NA>       NA     77

Data

dt1 <- structure(list(Father = c("Peter", "Josh", "Cold", NA, NA, NA
), Daughter = c(1L, 3L, 4L, 5L, 6L, 7L)), class = "data.frame", 
row.names = c(NA, 
-6L))

dt2 <- structure(list(Father = c("Peter", "Josh", "Cold", NA, NA, NA
), Weight = c(10L, 33L, 44L, 55L, 65L, 77L)), class = "data.frame",
row.names = c(NA, 
-6L))

Answer 1: (score: 1)

With dplyr and tidyr, you can replace the NAs in df1 and df2 with placeholders, join the data frames, and then convert the placeholders back to NA:

library(dplyr)
library(tidyr)

replace_na(df1, list(Father = "NA1")) %>% 
    full_join(replace_na(df2, list(Father = "NA2"))) %>% 
    mutate(Father = sub("NA.*", NA, Father))

#### OUTPUT ####

 Father Daughter Weight
1  Peter        1     10
2   Josh        3     33
3   Cold        4     44
4   <NA>        5     NA
5   <NA>        6     NA
6   <NA>        7     NA
7   <NA>       NA     55
8   <NA>       NA     65
9   <NA>       NA     77
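One caveat with the placeholder approach: the pattern "NA.*" is not anchored, so any name that merely contains "NA" would also be turned into NA. The sketch below illustrates this with a made-up name ("DANA" is hypothetical and not in the question's data) and shows an exact comparison against the two placeholders as one possible alternative.

```r
# "DANA" is a hypothetical name used only to show the pitfall
fathers <- c("Peter", "DANA", "NA1", "NA2")

# Unanchored pattern: every string containing "NA" is replaced by NA,
# including the real name "DANA"
via_regex <- sub("NA.*", NA, fathers)

# Exact comparison: only the two placeholder values are turned back into NA
restored <- fathers
restored[restored %in% c("NA1", "NA2")] <- NA

print(via_regex)
print(restored)
```

For the data in the question both variants give the same result, since no name contains "NA".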

With base R, you can first merge the parts of the data frames that have no NA, and then rbind the parts that do have NA:

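# Join only the rows of df1 whose Father is known
df3 <- merge(subset(df1, !is.na(Father)), df2, by = "Father")
# Add the column each data frame lacks so that rbind() can align columns by name
df1$Weight <- df2$Daughter <- NA
rbind(df3, subset(df2, is.na(Father)), subset(df1, is.na(Father)))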

#### OUTPUT ####

   Father Daughter Weight
1    Cold        4     44
2    Josh        3     33
3   Peter        1     10
4    <NA>       NA     55
5    <NA>       NA     65
6    <NA>       NA     77
41   <NA>        5     NA
51   <NA>        6     NA
61   <NA>        7     NA
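The base R steps above could be wrapped in a small reusable function. This is only a sketch and not part of the original answer: merge_skip_na and its arguments are hypothetical names, it assumes a single key column and no other column names shared between the two data frames, and it uses a full outer join for the non-NA rows rather than the inner join shown above.

```r
# Hypothetical helper: merge x and y on `key`, never matching NA keys to each other
merge_skip_na <- function(x, y, key) {
  x_na <- is.na(x[[key]])
  y_na <- is.na(y[[key]])

  # Join only the rows whose key is known
  merged <- merge(x[!x_na, , drop = FALSE], y[!y_na, , drop = FALSE],
                  by = key, all = TRUE)

  # Pad the NA-key rows with the columns they lack, filled with NA
  pad <- function(df) {
    for (col in setdiff(names(merged), names(df))) {
      df[[col]] <- rep(NA, nrow(df))
    }
    df[names(merged)]
  }

  out <- rbind(merged, pad(x[x_na, , drop = FALSE]), pad(y[y_na, , drop = FALSE]))
  rownames(out) <- NULL
  out
}

# Usage with the data from the first answer
dt1 <- data.frame(Father = c("Peter", "Josh", "Cold", NA, NA, NA),
                  Daughter = c(1L, 3L, 4L, 5L, 6L, 7L),
                  stringsAsFactors = FALSE)
dt2 <- data.frame(Father = c("Peter", "Josh", "Cold", NA, NA, NA),
                  Weight = c(10L, 33L, 44L, 55L, 65L, 77L),
                  stringsAsFactors = FALSE)

result <- merge_skip_na(dt1, dt2, "Father")
print(result)
```

Resetting the row names avoids the duplicated-looking labels (41, 51, 61) that rbind() generates in the output above.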