Question

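I have a CSV file with three columns. The first column is the pentad (five-day period) number (73 pentads per year), and the second and third columns are precipitation values.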

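What I want to do:

[1] Get the first pentad at which precipitation exceeds the "annual mean" for "at least three consecutive pentads".

I can find the pentads above the mean in the first precipitation column (RR) like this: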

dat <- read.csv("test.csv", header = TRUE, sep = ",")
aa <- which(dat$RR > mean(dat$RR))

This gives me the following:

[1] 27 28 29 30 31 34 36 37 38 41 42 43 44 45 46 52 53 54 55 56 57

In this case, the correct output should be P27.

For the second precipitation column (YY):

[1] 31 32 36 38 39 40 41 42 43 44 45 46 47 48 49 50 53 54 55 57 59 60 61

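The correct output should be P38.

How can I add a conditional statement here that accounts for the "three consecutive pentads" requirement?

I don't know how to implement this in R code. I would appreciate any suggestions.

I have the following data: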

Pentad  RR  YY
1   0   0.5771428571
2   0.0142857143    0
3   0   1.2828571429
4   0.0885714286    1.4457142857
5   0.0714285714    0.1114285714
6   0   0.36
7   0.0657142857    0
8   0.0285714286    0
9   0.0942857143    0
10  0.0114285714    1
11  0   0.0114285714
12  0   0.0085714286
13  0   0.3057142857
14  0   0
15  0   0
16  0   0
17  0.04    0
18  0   0.8
19  0.8142857143    0.0628571429
20  0.2857142857    0
21  1.14    0
22  5.3342857143    0
23  2.3514285714    0
24  1.9857142857    0.0133333333
25  1.4942857143    0.0433333333
26  2.0057142857    1.4866666667
27  20.0485714286   0
28  25.0085714286   2.4866666667
29  16.32   1.9433333333
30  11.0685714286   0.7733333333
31  8.9657142857    8.1066666667
32  3.9857142857    7.7333333333
33  5.2028571429    0.5
34  7.8028571429    4.3566666667
35  4.4514285714    2.66
36  9.22    6.6266666667
37  32.0485714286   4.4042857143
38  19.5057142857   7.9771428571
39  3.1485714286    12.9428571429
40  2.4342857143    18.4942857143
41  9.0571428571    7.3571428571
42  28.7085714286   11.0828571429
43  34.1514285714   9.0342857143
44  33.0257142857   14.2914285714
45  46.5057142857   34.6142857143
46  70.6171428571   45.3028571429
47  3.1685714286    6.66
48  1.9285714286    6.7028571429
49  7.0314285714    5.9628571429
50  0.9028571429    14.8542857143
51  5.3771428571    2.1
52  11.3571428571   2.8371428571
53  15.0457142857   7.3914285714
54  11.6628571429   32.0371428571
55  21.24   9.0057142857
56  11.4371428571   3.5257142857
57  11.6942857143   12.32
58  2.9771428571    2.32
59  4.3371428571    7.9942857143
60  0.8714285714    6.5657142857
61  1.3914285714    4.7714285714
62  0.8714285714    2.3542857143
63  1.1457142857    0.0057142857
64  2.3171428571    2.5085714286
65  0.1828571429    0.8171428571
66  0.2828571429    2.8857142857
67  0.3485714286    0.8971428571
68  0   0
69  0.3457142857    0
70  0.1428571429    0
71  0.18    0
72  4.8942857143    0.1457142857
73  0.0371428571    0.4342857143

Answer 1

Something like this should do it:

first_exceed_seq <- function(x, thresh = mean(x), len = 3)
{
  # Logical vector, does x exceed the threshold
  exceed_thresh <- x > thresh

  # Indices of transition points; where exceed_thresh[i - 1] != exceed_thresh[i]
  transition <- which(diff(c(0, exceed_thresh)) != 0)

  # Reference index, grouping observations after each transition
  index <- vector("numeric", length(x))
  index[transition] <- 1
  index <- cumsum(index)

  # Break x into groups following the transitions
  exceed_list <- split(exceed_thresh, index)

  # Count the values that exceed the threshold in each group
  num_exceed <- vapply(exceed_list, sum, numeric(1))

  # Get the starting index of the first group where at least len values exceed thresh
  transition[as.numeric(names(which(num_exceed >= len))[1])]
}

first_exceed_seq(dat$RR)
first_exceed_seq(dat$YY)
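
For comparison, the same idea can probably be expressed more compactly with base R's rle() (run-length encoding). The sketch below is not part of the original answer: the name first_run_start is hypothetical, the toy vector is made up for illustration, and missing values are assumed to be absent.

```r
# Hypothetical helper (not from the original answer): start index of the
# first run of at least `len` consecutive values above `thresh`.
# Assumes x contains no NA values.
first_run_start <- function(x, thresh = mean(x), len = 3) {
  # Run-length encode the logical vector "does x exceed the threshold?"
  runs <- rle(x > thresh)

  # Start index of each run in the original vector
  starts <- cumsum(c(1, head(runs$lengths, -1)))

  # Runs that are above the threshold and long enough
  ok <- which(runs$values & runs$lengths >= len)

  # No qualifying run: return NA rather than an empty vector
  if (length(ok) == 0) return(NA_integer_)
  as.integer(starts[ok[1]])
}

# Toy input, made up for illustration
x <- c(0, 5, 5, 0, 5, 5, 5, 0)
first_run_start(x, thresh = 1)
```

It would be called the same way as the function above, for example first_run_start(dat$RR). Given the index vectors listed in the question, that should give 27 for RR and 38 for YY.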
