How to use fill_by_function() and na.approx() [linear interpolation] with dplyr

Time: 2017-04-10 20:31:08

Tags: r date dplyr padr

I am working through the padr documentation:

https://cran.r-project.org/web/packages/padr/vignettes/padr.html

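Slightly changing the vignette example so that the data is linearly interpolated (with zoo::na.approx()) produces an error. The starting point is: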

library(tidyverse)
library(padr)
library(zoo)

set.seed(123)

emergency %>% 
  filter(title == 'EMS: DEHYDRATION') %>% 
  thicken(interval = 'day') %>% 
  group_by(time_stamp_day) %>% 
  summarise(nr = n() + as.integer(runif(1, 1, 99)) ) %>% 
  pad()

Result:

# A tibble: 307 × 2
   time_stamp_day    nr
           <date> <int>
1      2015-12-12    79
2      2015-12-13    42
3      2015-12-14    NA
4      2015-12-15    NA
5      2015-12-16    NA
6      2015-12-17    NA
7      2015-12-18    88
8      2015-12-19    NA
9      2015-12-20    NA
10     2015-12-21    NA
# ... with 297 more rows

Now I want to linearly interpolate between 42 and 88. I thought the best way to do this would be to use zoo::na.approx() inside padr::fill_by_function():

emergency %>% 
 filter(title == 'EMS: DEHYDRATION') %>% 
 thicken(interval = 'day') %>% 
 group_by(time_stamp_day) %>% 
 summarise(nr = n() + as.integer(runif(1, 1, 99)) ) %>% 
 pad() %>% 
 fill_by_function(nr, na.approx)

But I get the following error:

Error in inds[i] <- which(colnames_x == as.character(cols[[i]])) : 
  replacement has length zero

Any ideas on how to start solving this?

1 Answer:

Answer 0: (score: 1)

You just need mutate() with na.approx():

library(dplyr)   # provides %>% and mutate()
library(tibble)
library(zoo)
emergency <- as_tibble(read.table(text="time_stamp_day    nr
1      2015-12-12    79
2      2015-12-13    42
3      2015-12-14    NA
4      2015-12-15    NA
5      2015-12-16    NA
6      2015-12-17    NA
7      2015-12-18    88
8      2015-12-19    NA
9      2015-12-20    NA
10     2015-12-21    NA",header=TRUE,stringsAsFactors=FALSE))

emergency %>% mutate(nr = na.approx(nr, na.rm = FALSE))

# A tibble: 10 × 2
   time_stamp_day    nr
            <chr> <dbl>
1      2015-12-12  79.0
2      2015-12-13  42.0
3      2015-12-14  51.2
4      2015-12-15  60.4
5      2015-12-16  69.6
6      2015-12-17  78.8
7      2015-12-18  88.0
8      2015-12-19    NA
9      2015-12-20    NA
10     2015-12-21    NA
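
To tie this back to the padded pipeline in the question, here is a minimal sketch on toy data. The data frame `counts`, the columns `nr_lin` and `nr_ext`, and the `end_val` date are made up for illustration; the values are copied from the output above. As far as I can tell from the padr documentation, fill_by_function() expects the function in its `fun` argument and uses it to compute one fill value from the non-missing entries, so it is probably not the right tool for interpolation anyway. The trailing NAs stay NA because there is no later observation to interpolate towards; passing `rule = 2` (forwarded to stats::approx()) repeats the last observed value instead, if that is acceptable for your data.

```r
library(dplyr)
library(padr)
library(zoo)

# Toy stand-in for the summarised daily counts (hypothetical data)
counts <- data.frame(
  time_stamp_day = as.Date(c("2015-12-12", "2015-12-13", "2015-12-18")),
  nr = c(79, 42, 88)
)

filled <- counts %>%
  # insert the missing days; end_val extends the series to create trailing NAs
  pad(interval = "day", end_val = as.Date("2015-12-21")) %>%
  mutate(
    # interior gaps are interpolated, trailing NAs are kept
    nr_lin = na.approx(nr, na.rm = FALSE),
    # rule = 2 carries the nearest observed value out to the ends
    nr_ext = na.approx(nr, na.rm = FALSE, rule = 2)
  )

filled
```

Here nr_lin reproduces the column shown above (51.2, 60.4, 69.6, 78.8 between 42 and 88), while nr_ext additionally fills the last three days with 88.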