Calculating the duration until a specific percentage change for each row in R

Time: 2018-08-30 20:50:00

Tags: r time-series

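For each group and date, I want to know when the column value next increases by 1% or more. More specifically, for each row I want the duration (in days) until the value has risen by 1% or more relative to that row. For example, for group A, starting from 11/1/17, it took 8 days for the value to increase by 1%: (101-100)/100. Accordingly, for the next row (group A, 11/2/17), it took 7 days. And for (group B, 11/1/17), it took 3 days to increase by 1% or more: (105-100)/100.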

+-------+---------+--------+
| Group |  Date   | value  |
+-------+---------+--------+
| A     | 11/1/17 |    100 |
| A     | 11/2/17 |    100 |
| A     | 11/3/17 |    100 |
| A     | 11/4/17 |    100 |
| A     | 11/5/17 |    100 |
| A     | 11/6/17 |    100 |
| A     | 11/7/17 |    100 |
| A     | 11/8/17 |    100 |
| A     | 11/9/17 |    101 |
| B     | 11/1/17 |    100 |
| B     | 11/2/17 |    100 |
| B     | 11/3/17 |    100 |
| B     | 11/4/17 |    105 |
| B     | 11/5/17 |    100 |
| B     | 11/6/17 |    107 |
| B     | 11/7/17 |    100 |
| B     | 11/8/17 |    100 |
+-------+---------+--------+

This is the desired output:

+-------+---------+--------+---------------------------------+
| Group |  Date   | value  | next_1_percent_or_higher_change |
+-------+---------+--------+---------------------------------+
| A     | 11/1/17 |    100 | 8                               |
| A     | 11/2/17 |    100 | 7                               |
| A     | 11/3/17 |    100 | 6                               |
| A     | 11/4/17 |    100 | 5                               |
| A     | 11/5/17 |    100 | 4                               |
| A     | 11/6/17 |    100 | 3                               |
| A     | 11/7/17 |    100 | 2                               |
| A     | 11/8/17 |    100 | 1                               |
| A     | 11/9/17 |    101 | NA                              |
| B     | 11/1/17 |    100 | 3                               |
| B     | 11/2/17 |    100 | 2                               |
| B     | 11/3/17 |    100 | 1                               |
| B     | 11/4/17 |    105 | 2                               |
| B     | 11/5/17 |    100 | 1                               |
| B     | 11/6/17 |    107 | NA                              |
| B     | 11/7/17 |    100 | NA                              |
| B     | 11/8/17 |    100 | NA                              |
+-------+---------+--------+---------------------------------+

Update

This is what I have so far, but my solution does not scale: it needs one nested ifelse for every possible number of days.

shift <- function(x, n){
   c(x[-(seq(n))], rep(NA, n))
 }




df= do.call(rbind,by(df,df$Group, transform,next_1_percent_or_higher_change =
                        ifelse(((shift(value,1)-value)/value) >= .01,1,
                               ifelse(((shift(value,2)-value)/value) >= .01,2,
                               ifelse(((shift(value,3)-value)/value) >= .01,3,
                                      ifelse(((shift(value,4)-value)/value) >= .01,4,
                                             ifelse(((shift(value,5)-value)/value) >= .01,5,
                                                    ifelse(((shift(value,6)-value)/value) >= .01,6,
                                                           ifelse(((shift(value,7)-value)/value) >= .01,7,
                                                                  ifelse(((shift(value,8)-value)/value) >= .01,8,
                                                                         ifelse(((shift(value,9)-value)/value) >= .01,9,NA)))))))))))

1 Answer:

Answer 0: (score: 0)

Maybe something like this?

library(tidyverse)
library(lubridate)
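df %>%
    # Parse the dates first so that rows sort chronologically, not as text
    mutate(Date = mdy(Date)) %>%
    group_by(Group) %>%
    arrange(Group, Date) %>%
    # Note: the target value 101 is hard-coded, so this relies on the sample
    # data below having exactly one row per group where value == 101
    mutate(next_1_percent_or_higher_change = Date[which(value == 101)] - Date) %>%
    # Rows on or after the target date have no upcoming increase
    mutate(next_1_percent_or_higher_change = replace(next_1_percent_or_higher_change, next_1_percent_or_higher_change <= 0, NA))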
## A tibble: 17 x 4
## Groups:   Group [2]
#   Group Date       value next_1_percent_or_higher_change
#   <fct> <date>     <dbl> <time>
# 1 A     2017-11-01  100. 8
# 2 A     2017-11-02  100. 7
# 3 A     2017-11-03  100. 6
# 4 A     2017-11-04  100. 5
# 5 A     2017-11-05  100. 4
# 6 A     2017-11-06  100. 3
# 7 A     2017-11-07  100. 2
# 8 A     2017-11-08  100. 1
# 9 A     2017-11-09  101. NA
#10 B     2017-11-01  100. 3
#11 B     2017-11-02  100. 2
#12 B     2017-11-03  100. 1
#13 B     2017-11-04  101. NA
#14 B     2017-11-05  100. NA
#15 B     2017-11-06  100. NA
#16 B     2017-11-07  100. NA
#17 B     2017-11-08  100. NA

Sample data

df <- read.table(text =
    "Group   Date    value
 A      11/1/17     100
 A      11/2/17  100.01
 A      11/3/17  100.02
 A      11/4/17  100.03
 A      11/5/17  100.04
 A      11/6/17  100.05
 A      11/7/17  100.06
 A      11/8/17  100.07
 A      11/9/17     101
 B      11/1/17  100.01
 B      11/2/17  100.02
 B      11/3/17  100.03
 B      11/4/17     101
 B      11/5/17  100.05
 B      11/6/17  100.06
 B      11/7/17  100.07
 B      11/8/17  100.07 ", header = TRUE)
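
The code above depends on the value 101 appearing in each group. A more general version could compare every row with the rows after it in the same group and report the gap in days to the first one that is at least 1% higher. Below is a minimal base R sketch of that idea, using the table from the question. The helper name days_to_increase and its threshold argument are made up for illustration and do not come from the question or from any package. It does a pairwise scan within each group, so it is quadratic in the group size and may be slow on long series.

```r
# Data from the question's table, rebuilt as a data frame.
df <- data.frame(
  Group = rep(c("A", "B"), times = c(9, 8)),
  Date  = as.Date(c(paste0("2017-11-0", 1:9), paste0("2017-11-0", 1:8))),
  value = c(rep(100, 8), 101,
            100, 100, 100, 105, 100, 107, 100, 100)
)

# Hypothetical helper (not from any package): for each row, the number of days
# until the first later row whose value is at least (1 + threshold) times the
# current value, or NA if no such row exists. Assumes dates are sorted.
days_to_increase <- function(dates, values, threshold = 0.01) {
  idx <- seq_along(values)
  vapply(idx, function(i) {
    later <- which(idx > i & (values - values[i]) / values[i] >= threshold)
    if (length(later) == 0) {
      NA_real_
    } else {
      as.numeric(difftime(dates[later[1]], dates[i], units = "days"))
    }
  }, numeric(1))
}

# Sort by group and date, then apply the helper group by group.
# Because df is sorted by Group, the split/unlist order matches the row order.
df <- df[order(df$Group, df$Date), ]
df$next_1_percent_or_higher_change <- unlist(
  lapply(split(df, df$Group), function(g) days_to_increase(g$Date, g$value)),
  use.names = FALSE
)

print(df)
```

The threshold argument makes it possible to look for other changes, for example days_to_increase(dates, values, threshold = 0.05) for a 5% increase.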