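I want to replace every blank price with the previous week's price for the same sector. How can I do this in R? Could someone suggest a quick solution? Any suggestions would be greatly appreciated.
The input data frame is: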
sec date price
sec11 6/1/2019 309
sec12 7/1/2019 412
sec13 8/1/2019 206
sec14 9/1/2019 103
sec15 10/1/2019 257.5
sec16 11/1/2019 803.4
sec17 12/1/2019 638.6
sec11 13/1/2019 300
sec12 14/1/2019 400
sec13 15/1/2019 200
sec14 16/1/2019 100
sec15 17/1/2019 250
sec16 18/1/2019 780
sec17 19/1/2019 620
sec11 20/1/2019
sec12 21/1/2019
sec13 22/1/2019
sec14 23/1/2019
sec15 24/1/2019
sec16 25/1/2019
sec17 26/1/2019
Expected output:
sec date price
sec11 6/1/2019 309
sec12 7/1/2019 412
sec13 8/1/2019 206
sec14 9/1/2019 103
sec15 10/1/2019 257.5
sec16 11/1/2019 803.4
sec17 12/1/2019 638.6
sec11 13/1/2019 300
sec12 14/1/2019 400
sec13 15/1/2019 200
sec14 16/1/2019 100
sec15 17/1/2019 250
sec16 18/1/2019 780
sec17 19/1/2019 620
sec11 20/1/2019 300
sec12 21/1/2019 400
sec13 22/1/2019 200
sec14 23/1/2019 100
sec15 24/1/2019 250
sec16 25/1/2019 780
sec17 26/1/2019 620
Answer 0 (score: 3)
tidyr::fill with dplyr::group_by
This is straightforward with fill and group_by from the tidyverse:
library(tidyverse)
df %>%
group_by(sec) %>%
fill(price) %>%
ungroup()
## A tibble: 21 x 3
# sec date price
# <fct> <fct> <dbl>
# 1 sec11 6/1/2019 309
# 2 sec11 13/1/2019 300
# 3 sec11 20/1/2019 300
# 4 sec12 7/1/2019 412
# 5 sec12 14/1/2019 400
# 6 sec12 21/1/2019 400
# 7 sec13 8/1/2019 206
# 8 sec13 15/1/2019 200
# 9 sec13 22/1/2019 200
#10 sec14 9/1/2019 103
## ... with 11 more rows
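In the output above the rows have been reordered by sec. To reproduce the expected output exactly, we can add a row-number column, fill within each group, and then sort the final result by the original row number: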
df %>%
rowid_to_column() %>%
group_by(sec) %>%
fill(price) %>%
ungroup() %>%
arrange(rowid) %>%
select(-rowid) %>%
as.data.frame()
# sec date price
#1 sec11 6/1/2019 309.0
#2 sec12 7/1/2019 412.0
#3 sec13 8/1/2019 206.0
#4 sec14 9/1/2019 103.0
#5 sec15 10/1/2019 257.5
#6 sec16 11/1/2019 803.4
#7 sec17 12/1/2019 638.6
#8 sec11 13/1/2019 300.0
#9 sec12 14/1/2019 400.0
#10 sec13 15/1/2019 200.0
#11 sec14 16/1/2019 100.0
#12 sec15 17/1/2019 250.0
#13 sec16 18/1/2019 780.0
#14 sec17 19/1/2019 620.0
#15 sec11 20/1/2019 300.0
#16 sec12 21/1/2019 400.0
#17 sec13 22/1/2019 200.0
#18 sec14 23/1/2019 100.0
#19 sec15 24/1/2019 250.0
#20 sec16 25/1/2019 780.0
#21 sec17 26/1/2019 620.0
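Note that fill carries values forward in row order, not in date order, so the approach above relies on the rows already being chronological within each sec. A minimal sketch for the case where they are not, assuming the dates are day/month/year strings as in the question (the toy data frame and its values are made up for illustration):

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data, deliberately NOT in chronological order
toy <- data.frame(
  sec   = c("sec11", "sec11", "sec11", "sec12", "sec12", "sec12"),
  date  = c("20/1/2019", "6/1/2019", "13/1/2019",
            "21/1/2019", "14/1/2019", "7/1/2019"),
  price = c(NA, 309, 300, NA, 400, 412)
)

filled <- toy %>%
  # Parse the d/m/Y strings so that sorting is chronological, not alphabetical
  mutate(date = as.Date(date, format = "%d/%m/%Y")) %>%
  arrange(sec, date) %>%
  group_by(sec) %>%
  # "down" is the default direction: each NA takes the last value above it
  fill(price, .direction = "down") %>%
  ungroup()

filled$price
```

After sorting, each missing price takes the most recent earlier price of the same sec.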
zoo::na.locf with base R's ave
library(zoo)
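# na.rm = FALSE keeps leading NAs, so each group's result has the same length as its input
transform(df, price = ave(price, sec, FUN = function(x) na.locf(x, na.rm = FALSE)))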
# sec date price
#1 sec11 6/1/2019 309.0
#2 sec12 7/1/2019 412.0
#3 sec13 8/1/2019 206.0
#4 sec14 9/1/2019 103.0
#5 sec15 10/1/2019 257.5
#6 sec16 11/1/2019 803.4
#7 sec17 12/1/2019 638.6
#8 sec11 13/1/2019 300.0
#9 sec12 14/1/2019 400.0
#10 sec13 15/1/2019 200.0
#11 sec14 16/1/2019 100.0
#12 sec15 17/1/2019 250.0
#13 sec16 18/1/2019 780.0
#14 sec17 19/1/2019 620.0
#15 sec11 20/1/2019 300.0
#16 sec12 21/1/2019 400.0
#17 sec13 22/1/2019 200.0
#18 sec14 23/1/2019 100.0
#19 sec15 24/1/2019 250.0
#20 sec16 25/1/2019 780.0
#21 sec17 26/1/2019 620.0
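If adding a package dependency is undesirable, the same idea can be sketched in base R alone. The helper name locf below is hypothetical (it is not part of base R or zoo), and the small data frame is made up for illustration:

```r
# Last observation carried forward for a single vector.
# Leading NAs (with no earlier value to copy) stay NA.
locf <- function(x) {
  idx <- cumsum(!is.na(x))        # position of the latest non-NA value, 0 if none yet
  c(NA, x[!is.na(x)])[idx + 1]    # prepend NA so that idx == 0 maps to NA
}

# Hypothetical toy data in the same layout as the question
toy <- data.frame(
  sec   = c("sec11", "sec12", "sec11", "sec12", "sec11", "sec12"),
  price = c(309, 412, 300, 400, NA, NA)
)

# ave() applies locf() within each sec and keeps the original row order
toy$price <- ave(toy$price, toy$sec, FUN = locf)
toy$price
```

Because ave returns its result in the original row order, no re-sorting step is needed here.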
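Sample data
# The blank prices are written as '' (empty strings); read.table reads them as NA in the numeric column
df <- read.table(text =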
"sec date price
sec11 6/1/2019 309
sec12 7/1/2019 412
sec13 8/1/2019 206
sec14 9/1/2019 103
sec15 10/1/2019 257.5
sec16 11/1/2019 803.4
sec17 12/1/2019 638.6
sec11 13/1/2019 300
sec12 14/1/2019 400
sec13 15/1/2019 200
sec14 16/1/2019 100
sec15 17/1/2019 250
sec16 18/1/2019 780
sec17 19/1/2019 620
sec11 20/1/2019 ''
sec12 21/1/2019 ''
sec13 22/1/2019 ''
sec14 23/1/2019 ''
sec15 24/1/2019 ''
sec16 25/1/2019 ''
sec17 26/1/2019 '' ", header = TRUE)