R - Fill in blank values according to id

Time: 2015-10-03 20:15:37

Tags: r dplyr

Quick question: I would like to fill in the blank (0) values of id2 within each id group.

From this:

> head(dta)
        id     id2
1 B10388W4       0
2 B10388W4 B10388W
3 B10388W4 B10388W

to this:

        id     id2
1 B10388W4 B10388W
2 B10388W4 B10388W
3 B10388W4 B10388W

What is a concise way to fill in the values by group id?

I thought of something like this:

dta %>% 
  group_by(id) %>% 
  mutate( id3 = ifelse(id2 == 0, lead(id2), id2) )

But it is not very robust, because the 0 can appear anywhere within the same id, not only right before a filled row.
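
To illustrate the concern, here is a minimal sketch with hypothetical data (`dta_tail` is not part of the original question) in which the 0 sits at the end of the group. `lead()` then has no following row to borrow from, so the gap is filled with NA:

```r
library(dplyr)

# Hypothetical variant of the data: the 0 is the last row of its group
dta_tail <- data.frame(
  id  = c("B10388W4", "B10388W4", "B10388W4"),
  id2 = c("B10388W", "B10388W", "0"),
  stringsAsFactors = FALSE
)

res <- dta_tail %>%
  group_by(id) %>%
  # lead() looks one row ahead; for the last row of a group it returns NA
  mutate(id3 = ifelse(id2 == "0", lead(id2), id2)) %>%
  ungroup()

res$id3
# "B10388W" "B10388W" NA
```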

Any ideas are welcome.

Data

dta = structure(list(id = c("B10388W4", "B10388W4", "B10388W4"), 
id2 = c("0", "B10388W", "B10388W")), row.names = c(NA, -3L), 
class = "data.frame", .Names = c("id", "id2"))

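2 answers:

Answer 0: (score: 0)

For each id (group), you can take the unique non-0 value of id2 and use it to update the id2 column. This assumes that each unique id value has exactly one unique non-0 id2 value, possibly alongside some 0 values.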

# my example dataset
dt = data.frame(id = c("B10388W4","B10388W4","B10388W4","A10388W4","A10388W4","A10388W4"),
                id2 = c(0,"B10388W","B10388W","A10388W",0,0),
                stringsAsFactors = FALSE)

dt

#         id     id2
# 1 B10388W4       0
# 2 B10388W4 B10388W
# 3 B10388W4 B10388W
# 4 A10388W4 A10388W
# 5 A10388W4       0
# 6 A10388W4       0


library(dplyr)

dt %>%
  group_by(id) %>%
  mutate(id2_new = unique(id2[id2 != 0])) %>%
  select(-id2) %>%
  ungroup()

#          id id2_new
#       (chr)   (chr)
# 1 A10388W4 A10388W
# 2 A10388W4 A10388W
# 3 A10388W4 A10388W
# 4 B10388W4 B10388W
# 5 B10388W4 B10388W
# 6 B10388W4 B10388W
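
The assumption stated in this answer matters: if a group contains only 0 values, `unique(id2[id2 != 0])` is empty and `mutate()` would typically fail on that group. A hedged sketch of one possible workaround is to use `dplyr::first()`, which returns NA for an empty vector. The data frame `dt2` and its group "C1" are hypothetical and only serve to show the edge case:

```r
library(dplyr)

# Hypothetical data: group "C1" has no non-zero id2 at all
dt2 <- data.frame(
  id  = c("B10388W4", "B10388W4", "A10388W4", "A10388W4", "C1", "C1"),
  id2 = c("0", "B10388W", "A10388W", "0", "0", "0"),
  stringsAsFactors = FALSE
)

filled <- dt2 %>%
  group_by(id) %>%
  # first() of the non-zero values; an all-zero group yields NA instead of an error
  mutate(id2_new = first(id2[id2 != "0"])) %>%
  ungroup()
```

Note that `first()` silently picks one value if a group holds several distinct non-zero id2 values, so this sketch still relies on there being at most one per id.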

Answer 1: (score: 0)

# keep one non-zero id2 per id, then join it back onto every row's id
dta %>%
  filter(id2 != 0) %>%
  distinct() %>%
  full_join(dta %>% select(id), by = "id")
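
The join approach can be read as two steps: build a per-id lookup table of the non-zero id2 value, then join it back onto the original ids. A minimal sketch spelling this out, assuming at most one non-zero id2 per id; the names `lookup` and `result` are illustrative, and `left_join()` is swapped in for `full_join()` so that every original row is kept in its original order:

```r
library(dplyr)

# Data from the question
dta <- data.frame(
  id  = c("B10388W4", "B10388W4", "B10388W4"),
  id2 = c("0", "B10388W", "B10388W"),
  stringsAsFactors = FALSE
)

# Step 1: one row per id holding its non-zero id2
lookup <- dta %>%
  filter(id2 != "0") %>%
  distinct(id, id2)

# Step 2: attach the looked-up id2 to every original row
result <- dta %>%
  select(id) %>%
  left_join(lookup, by = "id")
```

With this variant, an id that has no non-zero id2 simply gets NA after the join.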