Find the start/end of a series and compute the % difference between them

Time: 2017-07-30 04:04:38

Tags: r

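I have a series of signals in which I want to find the start of each run, which is the first +1 in the signal column. When the first +1 is found, remember the Open price. Then loop until the last +1 of that run is found and save the Close price on that row. Once both prices are available, compute the difference between them: (Close value - Open value) / Open value. Then continue looping until the next +1.

Sample data and expected output: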

df <- data.frame(df,output=0)

      Open   Close signal       output
1  1469.25 1455.17      0  0.000000000
2  1455.22 1399.42      0  0.000000000
3  1399.42 1402.11      1  0.000000000
4  1402.11 1403.45      1  0.002879700
5  1403.45 1441.47      0  0.000000000
6  1441.47 1457.60      0  0.000000000
7  1457.60 1438.56      0  0.000000000
8  1438.56 1432.04      0  0.000000000
9  1432.25 1449.68      0  0.000000000
10 1449.68 1465.15      0  0.000000000
11 1465.15 1455.14      0  0.000000000
12 1455.14 1455.90      0  0.000000000
13 1455.90 1445.57      0  0.000000000
14 1445.57 1441.36      0  0.000000000
15 1441.36 1401.68      0  0.000000000
16 1401.53 1410.03      1  0.000000000
17 1410.03 1404.09      1  0.001826532
18 1404.09 1398.56      0  0.000000000
19 1398.56 1360.15      0  0.000000000
20 1360.16 1394.46      1  0.000000000
21 1394.46 1409.28      1  0.036113398
22 1409.28 1409.12      0  0.000000000
23 1409.12 1424.97      0  0.000000000
24 1424.97 1424.37      0  0.000000000
25 1424.37 1424.24      0  0.000000000
26 1424.24 1441.75      0  0.000000000
27 1441.72 1411.71      0  0.000000000
28 1411.70 1416.84      0  0.000000000
29 1416.83 1387.11      0  0.000000000
30 1387.12 1389.94      0  0.000000000
31 1389.94 1402.05      0  0.000000000
32 1402.05 1387.67      0  0.000000000
33 1387.67 1388.25      1  0.000000000
34 1388.26 1346.09      1  0.000000000
35 1346.09 1352.23      1  0.000000000
36 1352.17 1360.69      1  0.000000000
37 1360.69 1353.43      1  0.000000000
38 1353.43 1333.36      1  0.000000000
39 1333.36 1348.05      1 -0.028551449
40 1348.05 1366.42      0  0.000000000
41 1366.42 1379.23      0  0.000000000

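Take rows 33 to 39 as an example. It finds +1 at row 33 and saves the Open price for later use. It loops until the last +1 at row 39, then saves the Close price. Now we have two prices and can compute the % difference between them. It then continues looping until the next +1 signal.

Here is my butchered attempt: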
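As a quick check of the arithmetic for that run, the two prices can be plugged into the formula by hand. The variable names below are only illustrative; the values are the Open price from row 33 and the Close price from row 39 of the sample data.

```r
# Open price at the first +1 of the run (row 33 of the sample data)
open_price <- 1387.67
# Close price at the last +1 of the run (row 39 of the sample data)
close_price <- 1348.05

# (Close value - Open value) / Open value
pct_diff <- (close_price - open_price) / open_price
print(pct_diff)
```

The result should agree with the output value shown on row 39 of the table.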

state <- "off"
for (i in 1:nrow(df)) { # loop through data
  if (state == "off") { # off state, loop does nothing until signal = 1
    if (df$signal[i] == 0) {
      next
    } else { 
      open_price <- df$Open[i] # save open price for % calculation
      state <- "on"            # change state to "on"
    }
  } else if (state == "on") {
    if (df$signal[i] > 0) { # Find last +1
      close_price <- df$Close[i] # save close price for % calculation
      output <- (close_price - open_price)/ open_price # perform % calculation
      state <- "off"
    }
  }
}

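The idea is an on/off state: switch on during a run of +1, find the start and end values, do the % diff calculation, and stay off for the rest.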

1 Answer:

Answer 0 (score: 1)

Here is a solution using functions from the dplyr package and the rleid function from the data.table package. dt2 is the final output.

# Load packages
library(dplyr)

# Process the data
dt2 <- dt %>%
  mutate(RunID = data.table::rleid(signal)) %>%
  group_by(RunID) %>%
  mutate(output = ifelse(signal == 0, 0,
                         ifelse(row_number() == n(),
                                (last(Close) - first(Open))/first(Open), 0))) %>%
  ungroup() %>%
  select(-RunID)
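If neither package is available, the same run-based idea can be sketched in base R with rle(). This is an alternative under stated assumptions, not the code above: the small data frame ex and the helper vectors runs, starts, ends and on are made up for illustration, using a few rows from the question's data.

```r
# Illustrative data: rows 1-5 of the question's sample
ex <- data.frame(
  Open   = c(1469.25, 1455.22, 1399.42, 1402.11, 1403.45),
  Close  = c(1455.17, 1399.42, 1402.11, 1403.45, 1441.47),
  signal = c(0L, 0L, 1L, 1L, 0L)
)

runs   <- rle(ex$signal)            # lengths and values of consecutive runs
ends   <- cumsum(runs$lengths)      # last row index of each run
starts <- ends - runs$lengths + 1   # first row index of each run
on     <- runs$values == 1          # keep only the runs where signal is +1

# Zero everywhere, then fill the last row of each +1 run with the % difference
ex$output <- 0
ex$output[ends[on]] <-
  (ex$Close[ends[on]] - ex$Open[starts[on]]) / ex$Open[starts[on]]

print(ex)
```

Here the only +1 run covers rows 3 and 4, so the value lands on row 4 and every other row stays 0.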

Input data

dt <- structure(list(Open = c(1469.25, 1455.22, 1399.42, 1402.11, 1403.45, 
1441.47, 1457.6, 1438.56, 1432.25, 1449.68, 1465.15, 1455.14, 
1455.9, 1445.57, 1441.36, 1401.53, 1410.03, 1404.09, 1398.56, 
1360.16, 1394.46, 1409.28, 1409.12, 1424.97, 1424.37, 1424.24, 
1441.72, 1411.7, 1416.83, 1387.12, 1389.94, 1402.05, 1387.67, 
1388.26, 1346.09, 1352.17, 1360.69, 1353.43, 1333.36, 1348.05, 
1366.42), Close = c(1455.17, 1399.42, 1402.11, 1403.45, 1441.47, 
1457.6, 1438.56, 1432.04, 1449.68, 1465.15, 1455.14, 1455.9, 
1445.57, 1441.36, 1401.68, 1410.03, 1404.09, 1398.56, 1360.15, 
1394.46, 1409.28, 1409.12, 1424.97, 1424.37, 1424.24, 1441.75, 
1411.71, 1416.84, 1387.11, 1389.94, 1402.05, 1387.67, 1388.25, 
1346.09, 1352.23, 1360.69, 1353.43, 1333.36, 1348.05, 1366.42, 
1379.23), signal = c(0L, 0L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 1L, 1L, 0L, 0L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 0L, 0L)), .Names = c("Open", 
"Close", "signal"), row.names = c("1", "2", "3", "4", "5", "6", 
"7", "8", "9", "10", "11", "12", "13", "14", "15", "16", "17", 
"18", "19", "20", "21", "22", "23", "24", "25", "26", "27", "28", 
"29", "30", "31", "32", "33", "34", "35", "36", "37", "38", "39", 
"40", "41"), class = "data.frame")
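For comparison, the on/off loop from the question can probably be made to work by closing a run only when the next row is no longer +1 (or the data ends) and by writing the result into the output column. The sketch below is one way to do that; the function name run_pct_diff and the small data frame ex2 are hypothetical and not part of the original post.

```r
# Hypothetical helper: state-machine version of the run calculation
run_pct_diff <- function(df) {
  n <- nrow(df)
  df$output <- numeric(n)   # start with 0 on every row
  state <- "off"
  open_price <- NA_real_
  for (i in seq_len(n)) {
    if (state == "off" && df$signal[i] == 1) {
      open_price <- df$Open[i]   # first +1 of the run: remember Open
      state <- "on"
    }
    if (state == "on") {
      # the run ends on the last row of the data or when the next signal is not +1
      run_ends_here <- i == n || df$signal[i + 1] != 1
      if (run_ends_here) {
        df$output[i] <- (df$Close[i] - open_price) / open_price
        state <- "off"
      }
    }
  }
  df
}

# Illustrative data: rows 19-22 of the question's sample
ex2 <- data.frame(
  Open   = c(1398.56, 1360.16, 1394.46, 1409.28),
  Close  = c(1360.15, 1394.46, 1409.28, 1409.12),
  signal = c(0L, 1L, 1L, 0L)
)
res <- run_pct_diff(ex2)
print(res)
```

The grouped dplyr version is likely the better choice for large data; the loop is mainly useful for seeing where the original attempt went wrong, namely that it closed the run on the second +1 rather than the last one and never stored the result per row.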