R: create a new column and shift its values by one row

Time: 2019-02-04 16:14:54

Tags: r

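I want to create a new column that holds the following day's value, placed in the next row. My example data frame has three columns: date, price and return. Now I want to detect overreactions. A return counts as an overreaction if it is higher than the mean plus one standard deviation. Otherwise the value is NA.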

library(tidyverse)
library(quantmod)

df <- tibble( 
date = lubridate::today() +0:9,
price = c(1,2.5,2,3,5,6.5,4,9,3,4))

df <- mutate(df, return = Delt(price))

df <- df %>% mutate(overreaction= 
                  ifelse(return >  mean(df$return, na.rm = TRUE) +  sd(df$return, na.rm = TRUE),
                   yes = return, no = NA
          )
)

Now I want to create a new column that holds the following day's return whenever an overreaction occurred the day before.

df <- df %>% mutate(following_day = 
                  ifelse(overreaction != "NA",
                         yes= return%>% data.table::shift(n=1L, fill=NA, type=c("lead")),
                         no=NA)
                )

print(df)
# A tibble: 10 x 5
   date       price                  return                     overreaction               following_day
   <date>     <dbl>                    <dbl>                      <dbl>                       <dbl>
 1 2019-02-04   1                     NA                          NA                         NA    
 2 2019-02-05   2.5                    1.5                         1.5                       -0.200
 3 2019-02-06   2                     -0.200                      NA                         NA    
 4 2019-02-07   3                      0.5                        NA                         NA    
 5 2019-02-08   5                      0.667                      NA                         NA    
 6 2019-02-09   6.5                    0.3                        NA                         NA    
 7 2019-02-10   4                     -0.385                      NA                         NA    
 8 2019-02-11   9                      1.25                        1.25                      -0.667
 9 2019-02-12   3                     -0.667                      NA                         NA    
10 2019-02-13   4                      0.333                      NA                         NA    

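This works, except for one problem: I want the values in the following_day column shifted down by one row, so that each value sits on the row of the day it belongs to. The data frame should look like this: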

# A tibble: 10 x 5
   date       price                  return                     overreaction               following_day
   <date>     <dbl>                    <dbl>                      <dbl>                       <dbl>
 1 2019-02-04   1                     NA                          NA                         NA    
 2 2019-02-05   2.5                    1.5                         1.5                       NA
 3 2019-02-06   2                     -0.200                      NA                         -0.200    
 4 2019-02-07   3                      0.5                        NA                         NA    
 5 2019-02-08   5                      0.667                      NA                         NA    
 6 2019-02-09   6.5                    0.3                        NA                         NA    
 7 2019-02-10   4                     -0.385                      NA                         NA    
 8 2019-02-11   9                      1.25                        1.25                      NA
 9 2019-02-12   3                     -0.667                      NA                         -0.667    
10 2019-02-13   4                      0.333                      NA                         NA  

Can anyone help me?

1 answer:

Answer 0 (score: 0)

Apply dplyr::lag to df$following_day:

library(tidyverse)
library(quantmod)

df <- tibble( 
  date = lubridate::today() +0:9,
  price = c(1,2.5,2,3,5,6.5,4,9,3,4)) %>% 
  mutate(return= Delt(price))

df <- mutate(df, overreaction = 
                      ifelse( return > mean(df$return, na.rm = TRUE) +  sd(df$return, na.rm = TRUE),
                                          return, NA))

df <- mutate(df, following_day = ifelse(!is.na(overreaction),
                                           data.table::shift(df$return, type = "lead"),
                                           NA))

df$following_day <- dplyr::lag(df$following_day) # or data.table::shift

Output:

> df
# A tibble: 10 x 5
   date       price  return overreaction following_day
   <date>     <dbl>   <dbl>        <dbl>         <dbl>
 1 2019-02-04   1    NA            NA           NA    
 2 2019-02-05   2.5   1.5           1.5         NA    
 3 2019-02-06   2    -0.200        NA           -0.200
 4 2019-02-07   3     0.5          NA           NA    
 5 2019-02-08   5     0.667        NA           NA    
 6 2019-02-09   6.5   0.3          NA           NA    
 7 2019-02-10   4    -0.385        NA           NA    
 8 2019-02-11   9     1.25          1.25        NA    
 9 2019-02-12   3    -0.667        NA           -0.667
10 2019-02-13   4     0.333        NA           NA  
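
To see why the final dplyr::lag call moves each value down by one row, it may help to look at lead and lag on a plain vector. This is only a minimal sketch; the vector x is made up for illustration and is not part of the data above.

```r
library(dplyr)

x <- c(10, 20, 30, 40)  # illustrative values only, not from the question

# lead() pulls each value from the next position; the last slot becomes NA
lead_x <- dplyr::lead(x)

# lag() pushes each value to the next position; the first slot becomes NA
lag_x <- dplyr::lag(x)

print(lead_x)
print(lag_x)
```

In the answer's code the lead fetches the next day's return onto the overreaction row, and the lag then moves that value down onto the next day's own row.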

Instead of dplyr::lag, data.table::shift(df$following_day, type = "lag") achieves the same result.
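
Since a lead followed by a lag cancels out, the same column can probably be built in one step: test whether the previous row was an overreaction and, if so, keep the current row's return. The following is a sketch under stated assumptions, not the method given in the answer. It assumes the same example prices, uses a fixed start date instead of lubridate::today(), and computes the returns with plain arithmetic instead of quantmod::Delt so that it depends only on dplyr and tibble.

```r
library(dplyr)

df <- tibble::tibble(
  date  = as.Date("2019-02-04") + 0:9,   # fixed start date (assumption)
  price = c(1, 2.5, 2, 3, 5, 6.5, 4, 9, 3, 4)
) %>%
  mutate(
    # simple arithmetic return: price today relative to price yesterday
    return = price / lag(price) - 1,
    # keep the return only when it exceeds mean + 1 standard deviation
    overreaction = ifelse(
      return > mean(return, na.rm = TRUE) + sd(return, na.rm = TRUE),
      return, NA
    ),
    # keep today's return only when yesterday was an overreaction
    following_day = ifelse(!is.na(lag(overreaction)), return, NA)
  )

print(df)
```

With this variant no separate shift of following_day is needed, because the condition already looks at the previous row.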