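For each group and date, I want to know when the percentage change in the column value reaches an increase of 1% or more. More specifically, for each row I want the number of days it takes for that row's value to increase by 1% or more. For example, for group A starting at 11/1/17, it took 8 days for the value to increase by 1%: (101-100)/100. For the next row (group A, 11/2/17), it therefore took 7 days. Likewise, (group B, 11/1/17) took 3 days to increase by 1% or more: (105-100)/100.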
+-------+---------+--------+
| Group | Date | value |
+-------+---------+--------+
| A | 11/1/17 | 100 |
| A | 11/2/17 | 100 |
| A | 11/3/17 | 100 |
| A | 11/4/17 | 100 |
| A | 11/5/17 | 100 |
| A | 11/6/17 | 100 |
| A | 11/7/17 | 100 |
| A | 11/8/17 | 100 |
| A | 11/9/17 | 101 |
| B | 11/1/17 | 100 |
| B | 11/2/17 | 100 |
| B | 11/3/17 | 100 |
| B | 11/4/17 | 105 |
| B | 11/5/17 | 100 |
| B | 11/6/17 | 107 |
| B | 11/7/17 | 100 |
| B | 11/8/17 | 100 |
+-------+---------+--------+
Here is the desired output:
+-------+---------+--------+---------------------------------+
| Group | Date | value | next_1_percent_or_higher_change |
+-------+---------+--------+---------------------------------+
| A | 11/1/17 | 100 | 8 |
| A | 11/2/17 | 100 | 7 |
| A | 11/3/17 | 100 | 6 |
| A | 11/4/17 | 100 | 5 |
| A | 11/5/17 | 100 | 4 |
| A | 11/6/17 | 100 | 3 |
| A | 11/7/17 | 100 | 2 |
| A | 11/8/17 | 100 | 1 |
| A | 11/9/17 | 101 | NA |
| B | 11/1/17 | 100 | 3 |
| B | 11/2/17 | 100 | 2 |
| B | 11/3/17 | 100 | 1 |
| B | 11/4/17 | 105 | 2 |
| B | 11/5/17 | 100 | 1 |
| B | 11/6/17 | 107 | NA |
| B | 11/7/17 | 100 | NA |
| B | 11/8/17 | 100 | NA |
+-------+---------+--------+---------------------------------+
Update
Here is what I have so far, but my solution does not scale.
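# Shift x forward by n positions, padding the end with NA
shift <- function(x, n){
  c(x[-(seq(n))], rep(NA, n))
}
# One nested ifelse per look-ahead distance, applied within each group
df <- do.call(rbind, by(df, df$Group, transform, next_1_percent_or_higher_change =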
ifelse(((shift(value,1)-value)/value) >= .01,1,
ifelse(((shift(value,2)-value)/value) >= .01,2,
ifelse(((shift(value,3)-value)/value) >= .01,3,
ifelse(((shift(value,4)-value)/value) >= .01,4,
ifelse(((shift(value,5)-value)/value) >= .01,5,
ifelse(((shift(value,6)-value)/value) >= .01,6,
ifelse(((shift(value,7)-value)/value) >= .01,7,
ifelse(((shift(value,8)-value)/value) >= .01,8,
ifelse(((shift(value,9)-value)/value) >= .01,9,NA)))))))))))
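One way to avoid writing one `ifelse` per look-ahead distance is to scan forward from each row for the first later value that is at least 1% higher. The following is only a minimal base R sketch under stated assumptions: the helper name `days_to_increase` and its `threshold` argument are made up for illustration, the data frame is rebuilt from the example table in the question, and dates are assumed to be in month/day/two-digit-year form.

```r
# Example data rebuilt from the question's table
df <- data.frame(
  Group = rep(c("A", "B"), times = c(9, 8)),
  Date  = as.Date(c(sprintf("11/%d/17", 1:9), sprintf("11/%d/17", 1:8)),
                  format = "%m/%d/%y"),
  value = c(rep(100, 8), 101,
            100, 100, 100, 105, 100, 107, 100, 100)
)

# Hypothetical helper: for each row, the number of days until the first
# later row whose value is at least (1 + threshold) times the current
# value; NA if no later row qualifies. Expects one group, sorted by date.
days_to_increase <- function(date, value, threshold = 0.01) {
  n <- length(value)
  vapply(seq_len(n), function(i) {
    later <- seq_len(n) > i
    hit <- which(later & (value - value[i]) / value[i] >= threshold)
    if (length(hit) == 0) NA_real_ else as.numeric(date[hit[1]] - date[i])
  }, numeric(1))
}

# Sort so that the per-group results line up with the rows of df,
# then compute the column group by group.
df <- df[order(df$Group, df$Date), ]
df$next_1_percent_or_higher_change <- unlist(lapply(
  split(df, df$Group),
  function(g) days_to_increase(g$Date, g$value)
), use.names = FALSE)
```

The look-ahead is no longer capped at a fixed number of rows, but each row scans all later rows in its group, so the cost grows quadratically with group size. For very large groups a different strategy may be needed.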
Answer 0 (score: 0)
Maybe something like this?
library(tidyverse)
library(lubridate)
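df %>%
  group_by(Group) %>%
  arrange(Group, Date) %>%
  mutate(
    Date = mdy(Date),  # parse month/day/year strings into Date objects
    # days from each row to the row whose value is exactly 101 (hard-coded target)
    next_1_percent_or_higher_change = Date[which(value == 101)] - Date) %>%
  # zero or negative differences are not future increases, so set them to NA
  mutate(next_1_percent_or_higher_change = replace(
    next_1_percent_or_higher_change, next_1_percent_or_higher_change <= 0, NA))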
## A tibble: 17 x 4
## Groups: Group [2]
# Group Date value next_1_percent_or_higher_change
# <fct> <date> <dbl> <time>
# 1 A 2017-11-01 100. 8
# 2 A 2017-11-02 100. 7
# 3 A 2017-11-03 100. 6
# 4 A 2017-11-04 100. 5
# 5 A 2017-11-05 100. 4
# 6 A 2017-11-06 100. 3
# 7 A 2017-11-07 100. 2
# 8 A 2017-11-08 100. 1
# 9 A 2017-11-09 101. NA
#10 B 2017-11-01 100. 3
#11 B 2017-11-02 100. 2
#12 B 2017-11-03 100. 1
#13 B 2017-11-04 101. NA
#14 B 2017-11-05 100. NA
#15 B 2017-11-06 100. NA
#16 B 2017-11-07 100. NA
#17 B 2017-11-08 100. NA
Sample data:
df <- read.table(text =
"Group Date value
A 11/1/17 100
A 11/2/17 100.01
A 11/3/17 100.02
A 11/4/17 100.03
A 11/5/17 100.04
A 11/6/17 100.05
A 11/7/17 100.06
A 11/8/17 100.07
A 11/9/17 101
B 11/1/17 100.01
B 11/2/17 100.02
B 11/3/17 100.03
B 11/4/17 101
B 11/5/17 100.05
B 11/6/17 100.06
B 11/7/17 100.07
B 11/8/17 100.07 ", header = T)