I have a data frame:
ID <- c(1,1,2,2,2,3,3)
x <- c("1st","","1st","1st","","","")
y <- c("2nd","2nd","","","","2nd","2nd")
z <- c("","","3rd","3rd","","","3rd")
df <- data.frame(ID,x,y,z)
df
  ID   x   y   z
1  1 1st 2nd
2  1     2nd
3  2 1st     3rd
4  2 1st     3rd
5  2
6  3     2nd
7  3     2nd 3rd
Within each ID, I want to fill in the same value, so that the output looks like this:
  ID   x   y   z  x1  y1  z1
1  1 1st 2nd     1st 2nd
2  1     2nd     1st 2nd
3  2 1st     3rd 1st     3rd
4  2 1st     3rd 1st     3rd
5  2             1st     3rd
6  3     2nd             2nd 3rd
7  3     2nd 3rd         2nd 3rd
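If ID 1 has "1st" in x, then the new variable x1 should contain "1st" in every row of ID 1, and so on.
Update: here is the data if I have more variables but only need to use x, y, z: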
ID <- c(1,1,2,2,2,3,3)
x <- c("1st","","1st","1st","","","")
y <- c("2nd","2nd","","","","2nd","2nd")
z <- c("","","3rd","3rd","","","3rd")
m <- c(10:16)
n <- c(20:26)
df <- data.frame(ID,x,y,z,m,n)
Answer 0: (score: 2)
Here is an approach that uses tidyr::fill. If you used NA instead of empty strings (a good idea), this approach would be very simple:
library(dplyr)
library(tidyr)
# add versions of x to z with NA instead of empty strings
df %>% mutate_at(vars(x:z), funs('1' = na_if(., ''))) %>%
# set grouping for following operations
group_by(ID) %>%
# for added columns, fill values downwards and upwards within each group
fill(x_1:z_1) %>% fill(x_1:z_1, .direction = 'up') %>%
# reinsert empty strings for NAs
mutate_at(vars(x_1:z_1), funs(coalesce(., factor(''))))
## Source: local data frame [7 x 9]
## Groups: ID [3]
##
##      ID      x      y      z     m     n    x_1    y_1    z_1
##   <dbl> <fctr> <fctr> <fctr> <int> <int> <fctr> <fctr> <fctr>
## 1     1    1st    2nd           10    20    1st    2nd
## 2     1           2nd           11    21    1st    2nd
## 3     2    1st           3rd    12    22    1st           3rd
## 4     2    1st           3rd    13    23    1st           3rd
## 5     2                         14    24    1st           3rd
## 6     3           2nd           15    25           2nd    3rd
## 7     3           2nd    3rd    16    26           2nd    3rd
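In recent dplyr releases, funs() is deprecated and mutate_at() is superseded by across(), so the same idea might be written roughly as below. This is only a sketch, assuming dplyr >= 1.0 and tidyr >= 1.0 (for .direction = "downup") and character rather than factor columns; the name res is just a placeholder for the result.

```r
library(dplyr)
library(tidyr)

# Updated data from the question, with character columns (assumption: not factors)
df <- data.frame(
  ID = c(1, 1, 2, 2, 2, 3, 3),
  x  = c("1st", "", "1st", "1st", "", "", ""),
  y  = c("2nd", "2nd", "", "", "", "2nd", "2nd"),
  z  = c("", "", "3rd", "3rd", "", "", "3rd"),
  m  = 10:16,
  n  = 20:26,
  stringsAsFactors = FALSE
)

res <- df %>%
  # add x_1, y_1, z_1: copies of x, y, z with "" replaced by NA
  mutate(across(x:z, ~ na_if(.x, ""), .names = "{.col}_1")) %>%
  # fill within each ID, first downwards and then upwards
  group_by(ID) %>%
  fill(x_1:z_1, .direction = "downup") %>%
  ungroup() %>%
  # put the empty strings back where a group had no value at all
  mutate(across(x_1:z_1, ~ coalesce(.x, "")))

print(res)
```

Columns m and n are carried along untouched because only x:z are selected.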
Answer 1: (score: 2)
A slightly more direct approach using data.table:
df = data.frame(ID, x, y, z, stringsAsFactors=FALSE)
require(data.table)
setDT(df)[, c("x1", "y1", "z1") := lapply(.SD, function(x) x[which.max(x != "")]), by = ID]
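The key step is x[which.max(x != "")]: which.max() on a logical vector returns the position of the first TRUE, so it picks the first non-empty string in the group, and when nothing is non-empty it returns 1, so the group keeps "". The base R sketch below illustrates the same logic without data.table; the helper name first_non_empty is made up for this example, and ave() stands in for the by = ID grouping.

```r
# Data from the question, with character columns
df <- data.frame(
  ID = c(1, 1, 2, 2, 2, 3, 3),
  x  = c("1st", "", "1st", "1st", "", "", ""),
  y  = c("2nd", "2nd", "", "", "", "2nd", "2nd"),
  z  = c("", "", "3rd", "3rd", "", "", "3rd"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: first non-empty string in v, or "" if every element is ""
first_non_empty <- function(v) v[which.max(v != "")]

# ave() applies the helper within each ID group and recycles the single
# result over all rows of that group
for (col in c("x", "y", "z")) {
  df[[paste0(col, "1")]] <- ave(df[[col]], df$ID, FUN = first_non_empty)
}

print(df)
```

Because an all-empty group falls back to its first element, this yields empty strings rather than NA for groups with no value.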
Answer 2: (score: 1)
We can use dplyr:
library(dplyr)
df %>%
group_by(ID) %>%
mutate_each(funs((.[.!=""][1]))) %>%
setNames(., c("ID", paste0(names(df)[-1], 1))) %>%
select(-ID) %>%
bind_cols(df, .)
#  ID   x   y   z ID   x1   y1   z1
#1  1 1st 2nd      1  1st  2nd <NA>
#2  1     2nd      1  1st  2nd <NA>
#3  2 1st     3rd  2  1st <NA>  3rd
#4  2 1st     3rd  2  1st <NA>  3rd
#5  2              2  1st <NA>  3rd
#6  3     2nd      3 <NA>  2nd  3rd
#7  3     2nd 3rd  3 <NA>  2nd  3rd