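I am trying to insert a row of NA after each group of data in R.
A similar question has been asked before: Insert a blank row after each group of data
The accepted answer there also works in this case, as shown below.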
group <- c("a","b","b","c","c","c","d","d","d","d")
xvalue <- c(16:25)
yvalue <- c(1:10)
df <- data.frame(cbind(group,xvalue,yvalue))
df_new <- as.data.frame(lapply(df, as.character), stringsAsFactors = FALSE)
head(do.call(rbind, by(df_new, df$group, rbind, NA)), -1 )
group xvalue yvalue
a.1 a 16 1
a.2 <NA> <NA> <NA>
b.2 b 17 2
b.3 b 18 3
b.31 <NA> <NA> <NA>
c.4 c 19 4
c.5 c 20 5
c.6 c 21 6
c.41 <NA> <NA> <NA>
d.7 d 22 7
d.8 d 23 8
d.9 d 24 9
d.10 d 25 10
How can I do this with data.table for a large data.frame?
Answer 0 (score: 8)
You could try
df$group <- as.character(df$group)
setDT(df)[, .SD[1:(.N+1)], by=group][is.na(xvalue), group:=NA][!.N]
# group xvalue yvalue
#1: a 16 1
#2: NA NA NA
#3: b 17 2
#4: b 18 3
#5: NA NA NA
#6: c 19 4
#7: c 20 5
#8: c 21 6
#9: NA NA NA
#10: d 22 7
#11: d 23 8
#12: d 24 9
#13: d 25 10
Or, as suggested by @David Arenburg
setDT(df)[, indx := group][, .SD[1:(.N+1)], indx][,indx := NULL][!.N]
Or
setDT(df)[df[,.I[1:(.N+1)], group]$V1][!.N]
Or it can be simplified further, based on @eddi's comment
setDT(df)[df[, c(.I, NA), group]$V1][!.N]
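The index-based variants above rely on the fact that indexing by an NA row number returns a row of NA. A minimal base R sketch of the same idea, assuming the example data from the question (rebuilt here so the snippet stands alone) and groups that are already contiguous and sorted; `idx` and `res` are names introduced only for illustration:

```r
# Rebuild the example data from the question (all columns as character).
df <- data.frame(
  group  = c("a", "b", "b", "c", "c", "c", "d", "d", "d", "d"),
  xvalue = as.character(16:25),
  yvalue = as.character(1:10),
  stringsAsFactors = FALSE
)

# For each group, collect its row numbers and append a single NA.
idx <- unlist(lapply(split(seq_len(nrow(df)), df$group), c, NA),
              use.names = FALSE)

# Drop the trailing NA, then index: each NA row number yields an all-NA row.
res <- df[head(idx, -1L), ]
rownames(res) <- NULL
res
```

This is only meant to show the mechanism; for a large table the data.table versions above are the ones to use.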
Answer 1 (score: 5)
One way I can think of is to first construct a vector as follows:
foo <- function(x) {
o = order(rep.int(seq_along(x), 2L))
c(x, rep.int(NA, length(x)))[o]
}
join_values = head(foo(unique(df_new$group)), -1L)
# [1] "a" NA "b" NA "c" NA "d"
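To make the interleaving step easier to follow: `foo()` appends as many NA values as there are elements, then reorders so that each NA lands right after its element. As a hedged aside, the same vector can probably be built more compactly with `rbind()`, which is sketched below next to `foo()` for comparison; `x` and `interleaved` are illustrative names, not part of the original answer.

```r
# Interleaving helper from the answer above.
foo <- function(x) {
  o <- order(rep.int(seq_along(x), 2L))
  c(x, rep.int(NA, length(x)))[o]
}

x <- c("a", "b", "c", "d")

# Alternative: rbind() stacks x over a row of NA, and c() then reads
# the resulting 2-row matrix column by column.
interleaved <- c(rbind(x, NA))

# Drop the trailing NA, as in the answer.
head(interleaved, -1L)
```

Either form gives the join values used in the next step.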
Then setkey() and join.
setkey(setDT(df_new), group)
df_new[.(join_values), allow.cartesian=TRUE]
# group xvalue yvalue
# 1: a 16 1
# 2: NA NA NA
# 3: b 17 2
# 4: b 18 3
# 5: NA NA NA
# 6: c 19 4
# 7: c 20 5
# 8: c 21 6
# 9: NA NA NA
# 10: d 22 7
# 11: d 23 8
# 12: d 24 9
# 13: d 25 10