I have a data.table as follows:
dt <- structure(list(date = structure(c(18415L, 18416L, 18417L, 18418L,
18421L, 18422L, 18423L, 18424L, 18425L, 18428L, 18429L, 18430L,
18431L, 18432L, 18435L, 18436L, 18437L, 18438L, 18439L, 18442L,
18443L, 18444L, 18445L, 18449L, 18450L, 18451L, 18452L, 18453L,
18456L, 18457L, 18458L, 18459L, 18460L, 18463L, 18464L, 18465L,
18466L, 18467L, 18470L, 18471L, 18472L, 18473L, 18474L, 18477L,
18478L, 18479L, 18480L, 18481L, 18484L, 18485L, 18486L, 18487L,
18491L, 18493L, 18494L, 18495L, 18498L, 18499L, 18500L, 18501L,
18502L, 18505L), class = c("IDate", "Date")),
close = c(12.11,
11.26, 10.8, 10.335, 10.55, 10.73, 10.74, 10.27, 10.36, 10.59,
10.72, 10.2, 10.22, 9.94, 9.92, 9.71, 10.13, 10.81, 10.87, 11.06,
11.63, 11.245, 12.02, 12.62, 12.97, 13.37, 13.85, 13.425, 13.97,
14.01, 14.7, 14.72, 16.15, 16.49, 16.93, 17.05, 16.85, 16.5,
17.7, 17.24, 17.495, 18.73, 18.15, 19.15, 19.29, 19.58, 13.09,
14.05, 14.55, 13.41, 13.79, 13.44, 13.58, 13.15, 13.19, 14.785,
14.415, 15.085, 14.41, 11.53, 11.69, 11.72),
group = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0,
0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0)),
row.names = c(NA, -62L), class = c("data.table", "data.frame"))
It looks like this:
date close group
1: 2020-06-02 12.110 0
2: 2020-06-03 11.260 0
3: 2020-06-04 10.800 0
4: 2020-06-05 10.335 0
5: 2020-06-08 10.550 0
6: 2020-06-09 10.730 0
7: 2020-06-10 10.740 0
8: 2020-06-11 10.270 0
9: 2020-06-12 10.360 0
10: 2020-06-15 10.590 1
11: 2020-06-16 10.720 1
12: 2020-06-17 10.200 1
13: 2020-06-18 10.220 0
14: 2020-06-19 9.940 0
15: 2020-06-22 9.920 1
16: 2020-06-23 9.710 0
17: 2020-06-24 10.130 1
18: 2020-06-25 10.810 1
19: 2020-06-26 10.870 1
20: 2020-06-29 11.060 1
21: 2020-06-30 11.630 1
22: 2020-07-01 11.245 1
23: 2020-07-02 12.020 1
24: 2020-07-06 12.620 1
25: 2020-07-07 12.970 1
26: 2020-07-08 13.370 1
27: 2020-07-09 13.850 1
28: 2020-07-10 13.425 1
29: 2020-07-13 13.970 1
30: 2020-07-14 14.010 1
31: 2020-07-15 14.700 1
32: 2020-07-16 14.720 1
33: 2020-07-17 16.150 1
34: 2020-07-20 16.490 1
35: 2020-07-21 16.930 1
36: 2020-07-22 17.050 1
37: 2020-07-23 16.850 1
38: 2020-07-24 16.500 1
39: 2020-07-27 17.700 1
40: 2020-07-28 17.240 0
41: 2020-07-29 17.495 0
42: 2020-07-30 18.730 0
43: 2020-07-31 18.150 0
44: 2020-08-03 19.150 0
45: 2020-08-04 19.290 1
46: 2020-08-05 19.580 0
47: 2020-08-06 13.090 0
48: 2020-08-07 14.050 0
49: 2020-08-10 14.550 0
50: 2020-08-11 13.410 0
51: 2020-08-12 13.790 0
52: 2020-08-13 13.440 0
53: 2020-08-17 13.580 0
54: 2020-08-19 13.150 1
55: 2020-08-20 13.190 0
56: 2020-08-21 14.785 1
57: 2020-08-24 14.415 1
58: 2020-08-25 15.085 1
59: 2020-08-26 14.410 1
60: 2020-08-27 11.530 0
61: 2020-08-28 11.690 0
62: 2020-08-31 11.720 0
date close group
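The close column is the stock's closing price, and the group column is the buy/sell decision. If group is 1, it is a buy decision; otherwise it is a sell decision.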
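I need to add a new column to this data.table that shows the cumulative return of each group at any given time.
Cumulative return is the total change in the investment's price over a set period, i.e. the total return.
Is there an existing function in R that does this?
Thanks in advance!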
Answer 0 (score: 3)
Maybe this is what you are looking for?
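library(data.table)
# One row per run of consecutive equal group values (rleid numbers the runs)
dt[, .(start  = head(date, 1),
       end    = tail(date, 1),
       change = tail(close, 1) - head(close, 1)),
   by = rleid(group)]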
rleid start end change
1: 1 2020-06-02 2020-06-12 -1.750
2: 2 2020-06-15 2020-06-17 -0.390
3: 3 2020-06-18 2020-06-19 -0.280
4: 4 2020-06-22 2020-06-22 0.000
5: 5 2020-06-23 2020-06-23 0.000
6: 6 2020-06-24 2020-07-27 7.570
7: 7 2020-07-28 2020-08-03 1.910
8: 8 2020-08-04 2020-08-04 0.000
9: 9 2020-08-05 2020-08-17 -6.000
10: 10 2020-08-19 2020-08-19 0.000
11: 11 2020-08-20 2020-08-20 0.000
12: 12 2020-08-21 2020-08-26 -0.375
13: 13 2020-08-27 2020-08-31 0.190
data.table's head and tail methods are very efficient.
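The summary above gives one row per run. Since the question asks for a new column, here is a minimal sketch of a per-row variant. It assumes that "cumulative return" means the change relative to the first close of the current run of equal group values. The names dt_small, run, cum_change and cum_return are made up for illustration, and the sample rows are rows 8 to 14 of the question's table.

```r
library(data.table)

# A few rows copied from the question's table (rows 8 to 14).
dt_small <- data.table(
  date  = as.IDate(c("2020-06-11", "2020-06-12", "2020-06-15", "2020-06-16",
                     "2020-06-17", "2020-06-18", "2020-06-19")),
  close = c(10.27, 10.36, 10.59, 10.72, 10.20, 10.22, 9.94),
  group = c(0, 0, 1, 1, 1, 0, 0)
)

# run: id of each stretch of consecutive equal group values.
dt_small[, run := rleid(group)]

# cum_change: price change since the first close of the current run.
# cum_return: the same change as a fraction of that first close.
dt_small[, `:=`(
  cum_change = close - first(close),
  cum_return = close / first(close) - 1
), by = run]

print(dt_small)
```

The last cum_change value of each run should match the change column of the summary for that run, provided the run starts at the same row. In this excerpt the first run is cut off, so only the later runs are comparable with the full table.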
Answer 1 (score: 2)
Using dplyr:
library(dplyr)
dt %>%
group_by(grp = with(rle(group), rep(seq_along(values), lengths))) %>%
summarise(start = first(date), end = last(date),
change = last(close) - first(close), .groups = 'drop')
Output:
# A tibble: 13 x 4
# grp start end change
# <int> <date> <date> <dbl>
# 1 1 2020-06-02 2020-06-12 -1.75
# 2 2 2020-06-15 2020-06-17 -0.39
# 3 3 2020-06-18 2020-06-19 -0.28
# 4 4 2020-06-22 2020-06-22 0
# 5 5 2020-06-23 2020-06-23 0
# 6 6 2020-06-24 2020-07-27 7.57
# 7 7 2020-07-28 2020-08-03 1.91
# 8 8 2020-08-04 2020-08-04 0
# 9 9 2020-08-05 2020-08-17 -6.00
#10 10 2020-08-19 2020-08-19 0
#11 11 2020-08-20 2020-08-20 0
#12 12 2020-08-21 2020-08-26 -0.375
#13 13 2020-08-27 2020-08-31 0.19
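If a per-row column is wanted rather than a summary, the same run grouping can be combined with mutate instead of summarise. This is only a sketch, assuming the return is measured from the first close of each run. df_small, cum_change and cum_return are illustrative names, and the sample values are rows 8 to 14 of the question's table.

```r
library(dplyr)

# A few close/group values copied from the question's table (rows 8 to 14).
df_small <- tibble(
  close = c(10.27, 10.36, 10.59, 10.72, 10.20, 10.22, 9.94),
  group = c(0, 0, 1, 1, 1, 0, 0)
)

res <- df_small %>%
  # grp numbers each run of consecutive equal group values, as in the answer.
  group_by(grp = with(rle(group), rep(seq_along(values), lengths))) %>%
  # Keep every row and measure each close against the first close of its run.
  mutate(cum_change = close - first(close),
         cum_return = close / first(close) - 1) %>%
  ungroup()

print(res)
```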