I have a data frame, closeValues, that looks like this:
>closeValues
date value
1 1980-12-10 5
2 1980-12-15 8
3 1980-12-18 7
4 1980-12-20 1
However, wherever a date is missing, I need to fill the "value" field with the previous value. In other words, I need the following output:
>closeValues
date value
1 1980-12-10 5
2 1980-12-11 5
3 1980-12-12 5
4 1980-12-13 5
5 1980-12-14 5
6 1980-12-15 8
7 1980-12-16 8
8 1980-12-17 8
9 1980-12-18 7
10 1980-12-19 7
11 1980-12-20 1
Is this possible in R?
Answer 0 (score: 3)
Using na.locf from the zoo package, you can do the following:
library(zoo)
dat1 <- data.frame(date = seq(as.Date('1980-12-10'), as.Date('1980-12-20'), by = 'day'))
## the merge fills the missing dates with NA, and na.locf does the rest
na.locf(zoo(merge(dat1, dat, all.x = TRUE)))
date value
1 1980-12-10 5
2 1980-12-11 5
3 1980-12-12 5
4 1980-12-13 5
5 1980-12-14 5
6 1980-12-15 8
7 1980-12-16 8
8 1980-12-17 8
9 1980-12-18 7
10 1980-12-19 7
11 1980-12-20 1
PS: Please provide a reproducible example next time. You can write this:
dat <- data.frame(date = as.Date(c('1980-12-10','1980-12-15',
'1980-12-18','1980-12-20')),
value=c(5,8,7,1))
or
dput(dat)
structure(list(date = structure(c(3996, 4001, 4004, 4006), class = "Date"),
value = c(5, 8, 7, 1)), .Names = c("date", "value"), row.names = c(NA,
-4L), class = "data.frame")
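As a self-contained variant of the merge-then-na.locf idea in this answer, the sketch below fills only the value column, so date stays a Date and value stays numeric. The names all_days and filled are illustrative and not part of the original answer, and it assumes the zoo package is installed.

```r
library(zoo)

# The question's data, as in the reproducible example above
dat <- data.frame(date = as.Date(c('1980-12-10', '1980-12-15',
                                   '1980-12-18', '1980-12-20')),
                  value = c(5, 8, 7, 1))

# One row per calendar day between the first and last observed date
# (all_days is an illustrative name)
all_days <- data.frame(date = seq(min(dat$date), max(dat$date), by = 'day'))

# Left join: days without an observation get value = NA
filled <- merge(all_days, dat, by = 'date', all.x = TRUE)

# Carry the last observation forward on the value column only;
# na.rm = FALSE keeps the vector length even if the first value were NA
filled$value <- na.locf(filled$value, na.rm = FALSE)
filled
```

Filling a single column rather than the whole object avoids coercing the columns to a common type, which may matter if the result is used in further calculations.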
Answer 1 (score: 1)
This might do what you want in base R:
df.1 <- read.table(text='
DATE VALUE
1980-12-10 5
1980-12-15 8
1980-12-18 7
1980-12-20 1', header=T, colClasses=c('character', 'numeric'))
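df.1$DATE2 <- as.Date(df.1$DATE)
# days until the next observation; the last row is kept exactly once
df.1$diffs <- c(as.numeric(diff(df.1$DATE2)), 1)
# repeat each row once for every day it has to cover
df.2 <- df.1[rep(seq_len(nrow(df.1)), df.1$diffs), ]
# position within each repeated run (1, 2, ..., diffs); built from diffs rather
# than rle(VALUE) so that equal consecutive values are not merged into one run
df.2$running.count <- sequence(df.1$diffs)
df.2$DATE3 <- df.2$DATE2 + (df.2$running.count - 1)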
df.2
# DATE VALUE DATE2 diffs running.count DATE3
# 1 1980-12-10 5 1980-12-10 5 1 1980-12-10
# 1.1 1980-12-10 5 1980-12-10 5 2 1980-12-11
# 1.2 1980-12-10 5 1980-12-10 5 3 1980-12-12
# 1.3 1980-12-10 5 1980-12-10 5 4 1980-12-13
# 1.4 1980-12-10 5 1980-12-10 5 5 1980-12-14
# 2 1980-12-15 8 1980-12-15 3 1 1980-12-15
# 2.1 1980-12-15 8 1980-12-15 3 2 1980-12-16
# 2.2 1980-12-15 8 1980-12-15 3 3 1980-12-17
# 3 1980-12-18 7 1980-12-18 2 1 1980-12-18
# 3.1 1980-12-18 7 1980-12-18 2 2 1980-12-19
# 4 1980-12-20 1 1980-12-20 1 1 1980-12-20
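If only the date and value columns are needed, a shorter base R route may be enough. The sketch below is an alternative technique, not this answer's method: for each calendar day, findInterval looks up the index of the latest observed date on or before that day. It assumes the observed dates are sorted in increasing order, and the names all_dates, idx and filled are illustrative.

```r
# The question's data
dat <- data.frame(date = as.Date(c('1980-12-10', '1980-12-15',
                                   '1980-12-18', '1980-12-20')),
                  value = c(5, 8, 7, 1))

# Every calendar day between the first and last observation
all_dates <- seq(min(dat$date), max(dat$date), by = 'day')

# For each day, the index of the latest observed date that is <= that day
# (assumes dat$date is sorted in increasing order)
idx <- findInterval(as.numeric(all_dates), as.numeric(dat$date))

# Look up the carried-forward value for every day
filled <- data.frame(date = all_dates, value = dat$value[idx])
filled
```

This avoids the helper columns entirely, at the cost of requiring sorted input; unsorted data would need dat <- dat[order(dat$date), ] first.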