观察到任何滞后后将观察值设置为NA

时间:2019-05-03 06:53:02

标签: r dataframe dplyr

如果我两个月回报的滞后时间小于-0.7,我想将下一个x <- x %>% group_by(seriesid) %>% mutate(first = { twomonthreturn <= (-0.7) } %>% { . * !duplicated(.) } ) %>% mutate(first = ifelse(first==1,datem,NA)) #%>% mutate(closeret = ifelse(datem<=sum(first,na.rm = TRUE),closeret,NA)) 观察值设置为NA。

我尝试过:

date seriesid price closeret twomonthreturn
2018-07-25 50005 3.100 NA NA
2018-08-14 50005 2.500-0.19354839 NA
2018-09-28 50005 2.350 -0.06000000 -0.24193548
2018-10-27 50005 0.800 -0.65957447 -0.68000000 
2018-11-27 50005 0.500 -0.37500000 -0.7872340
2018-12-31 50005 0.300 -0.40000000 -0.62500000
2019-01-26 50005 0.360 0.20000000 -0.28000000
2019-02-23 50005 0.300 -0.16666667 0.00000000
2017-01-21 50006 7.000 NA NA
2017-03-28 50006 9.750 NA NA
2017-04-14 50006 8.875 -0.08974359 NA
2017-05-20 50006 9.000 0.01408451 -0.07692308
2017-06-22 50006 9.000 0.00000000 0.01408451

其中datem =年(日期)* 12 +月(日期)

date seriesid price closeret twomonthreturn
2018-07-25 50005 3.100 NA NA
2018-08-14 50005 2.500-0.19354839 NA
2018-09-28 50005 2.350 -0.06000000 -0.24193548
2018-10-27 50005 0.800 -0.65957447 -0.68000000 
2018-11-27 50005 0.500 -0.37500000 -0.7872340
2018-12-31 50005 0.300 NA -0.62500000
2019-01-26 50005 0.360 NA -0.28000000
2019-02-23 50005 0.300 NA 0.00000000
2017-01-21 50006 7.000 NA NA
2017-03-28 50006 9.750 NA NA
2017-04-14 50006 8.875 -0.08974359 NA
2017-05-20 50006 9.000 0.01408451 -0.07692308
2017-06-22 50006 9.000 0.00000000 0.01408451

我正在寻找每个组的解决方案,我的解决方案适用于第一个组,但是如果没有第一个解决方案(因为它是NA),那么此解决方案将不起作用。

service

1 个答案:

答案 0 :(得分:0)

希望这将起作用:

解决方案

import urllib.request as urllib2 # change of IP address page = urllib2.urlopen("http://example.com/").read() replace一起用作cumsum(lag(replace_na(twomonthreturn, Inf), , Inf) <= -.7) > 0- 0.7之后组中所有值的索引,因此它将全部替换twomonthreturnclosered中的此类值。

NA

输出

library(tidyverse)

dat %>%
  arrange(seriesid, date) %>%
  group_by(seriesid) %>%
  mutate(
    closeret = replace(
      closeret,
      cumsum(lag(replace_na(twomonthreturn, Inf), , Inf) <= -.7) > 0,
      NA_real_
      )
    ) %>%
  ungroup()

数据

# A tibble: 13 x 5
   date       seriesid price closeret twomonthreturn
   <date>        <dbl> <dbl>    <dbl>          <dbl>
 1 2018-07-25    50005  3.1   NA             NA     
 2 2018-08-14    50005  2.5   -0.194         NA     
 3 2018-09-28    50005  2.35  -0.06          -0.242 
 4 2018-10-27    50005  0.8   -0.660         -0.68  
 5 2018-11-27    50005  0.5   -0.375         -0.787 
 6 2018-12-31    50005  0.3   NA             -0.625 
 7 2019-01-26    50005  0.36  NA             -0.28  
 8 2019-02-23    50005  0.3   NA              0     
 9 2017-01-21    50006  7     NA             NA     
10 2017-03-28    50006  9.75  NA             NA     
11 2017-04-14    50006  8.88  -0.0897        NA     
12 2017-05-20    50006  9      0.0141        -0.0769
13 2017-06-22    50006  9      0              0.0141