I have a dataset in R that looks like this:
DF <- data.frame(name=c("A","b","c","d","B","e","f"),
x=c(NA,1,2,3,NA,4,5))
I want to reshape it into:
rDF <- data.frame(name=c("b","c","d","e","f"),
x=c(1,2,3,4,5),
head=c("A","A","A","B","B"))
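Here each row with NA in x marks a new header: its name becomes the "row value" of the new column and is carried down to the rows that follow, until the next row with NA, where the "row value" changes.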
I have tried spread and melt, but neither gives me what I want.
library(tidyr)
DF %>% spread(name,x)
library(reshape2)
melt(DF, id=c('name'))
Any suggestions?
Answer 0 (score: 5)
Here is a possible solution combining the data.table and zoo packages:
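library(data.table); library(zoo)
# Copy the header names (rows where x is NA) into a new column 'head'
setDT(DF)[is.na(x), head := name]
# Carry each header forward with na.locf, then drop the header rows (NA in x)
na.omit(DF[, head := na.locf(head)], "x")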
# name x head
# 1: b 1 A
# 2: c 2 A
# 3: d 3 A
# 4: e 4 B
# 5: f 5 B
Or, as @Arun suggested, using data.table only:
na.omit(setDT(DF)[, head := name[is.na(x)], by=cumsum(is.na(x))])
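The grouping key by=cumsum(is.na(x)) is what makes the one-liner above work, so it may help to look at it in isolation. The following is a minimal base R sketch, not part of the original answer; the names grp, headers and head_col are made up for illustration, and it assumes the first row of DF is always a header row (NA in x).

```r
DF <- data.frame(name = c("A", "b", "c", "d", "B", "e", "f"),
                 x = c(NA, 1, 2, 3, NA, 4, 5),
                 stringsAsFactors = FALSE)

# Every NA in x starts a new block, so the running count of NAs
# acts as a block id: 1 1 1 1 2 2 2 for this data
grp <- cumsum(is.na(DF$x))

# Header names, one per block, in order of appearance
headers <- DF$name[is.na(DF$x)]

# Look up the header of each row by its block id
# (assumes grp starts at 1, i.e. the first row is a header)
head_col <- headers[grp]
```

Within each block exactly one row has NA in x, so name[is.na(x)] is a single value that data.table recycles across the block.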
Answer 1 (score: 3)
You could try:
library(data.table)
library(magrittr)
split(DF, cumsum(is.na(DF$x))) %>%
lapply(function(u) transform(u[-1,], head=u[1,1])) %>%
rbindlist
# name x head
#1: b 1 A
#2: c 2 A
#3: d 3 A
#4: e 4 B
#5: f 5 B
Answer 2 (score: 3)
Here is an approach that uses only base R functions:
idx <- is.na(DF$x)
x <- rle(cumsum(idx))$lengths
DF$head <- rep(DF$name[idx], x)
DF[!idx,]
# name x head
#2 b 1 A
#3 c 2 A
#4 d 3 A
#6 e 4 B
#7 f 5 B
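As a rough sketch, the same base R steps can be wrapped in a function and checked against the rDF from the question. The function name carry_header is hypothetical, and the code assumes the columns are called name and x and that the first row is a header row.

```r
# Hypothetical helper: same rle/rep logic as above, wrapped for reuse
carry_header <- function(df) {
  idx <- is.na(df$x)                      # TRUE on header rows
  stopifnot(idx[1])                       # assumption: data starts with a header
  block_len <- rle(cumsum(idx))$lengths   # rows per block, header included
  df$head <- rep(df$name[idx], block_len) # repeat each header over its block
  out <- df[!idx, ]                       # drop the header rows
  rownames(out) <- NULL                   # renumber rows 1..n
  out
}

DF <- data.frame(name = c("A", "b", "c", "d", "B", "e", "f"),
                 x = c(NA, 1, 2, 3, NA, 4, 5),
                 stringsAsFactors = FALSE)

rDF <- data.frame(name = c("b", "c", "d", "e", "f"),
                  x = c(1, 2, 3, 4, 5),
                  head = c("A", "A", "A", "B", "B"),
                  stringsAsFactors = FALSE)

result <- carry_header(DF)
stopifnot(isTRUE(all.equal(result, rDF)))
```

Resetting the row names is optional. It is done here only so that the result compares equal to rDF, whereas the output above keeps the original row numbers 2, 3, 4, 6, 7.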