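My data has a desired range, but it strays into regions that are considered too high or too low. I would like to group each run of too-high or too-low points as its own separate instance. I made some fake data here: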
library(dplyr)
library(ggplot2)
set.seed(123432)
dat <- data.frame(value = sample(20:600, 20, replace = FALSE)) %>%
  mutate(ord = row_number(),
         cat = ifelse(value > 350, "high",
                      ifelse(value < 90, "low", "good")),
         extreme = ifelse(cat == "high" & value > lag(value) & value > lead(value), "Peak",
                          ifelse(cat == "low" & value < lag(value) & value < lead(value), "Trough", "")))
Here is a plot:
ggplot(dat, aes(x = ord, y = value))+
geom_point()+
geom_line()+
geom_hline(yintercept = 300, color="blue")+
geom_hline(yintercept = 120, color="blue")+
coord_fixed(.025)
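I know how to group these high and low regions in Excel, but I can't seem to replicate it in R. I would like to produce something like this (though E1 would be "Series"):
Note that column E is based on column C; each series can have multiple peaks/troughs.
I hope this is clear and that you can help. If possible, I would like to stick with dplyr.
Thanks.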
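Answer 0 (score: 2)
Based on your description in the comments, I think this is what you are looking for. Note that I parameterized the length with the variable n: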
library(dplyr)
library(ggplot2)
set.seed(123432)
n <- 20
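dat <- data.frame(value = sample(20:600, n, replace = FALSE)) %>%
  mutate(ord = row_number(),
         cat = ifelse(value > 350, "high",
                      ifelse(value < 90, "low", "good")),
         extreme = ifelse(cat == "high" & value > lag(value) &
                            value > lead(value), "Peak",
                          ifelse(cat == "low" & value < lag(value) &
                                   value < lead(value), "Trough", "")),
         # c1 is the current category, c2 the previous row's category
         # (row 1 is compared with itself, so it never counts as a change)
         c1 = cat,
         c2 = c(cat[1], cat[1:(n - 1)]),
         # chg increases by one every time the category changes,
         # so each consecutive run of the same category gets its own id
         chg = cumsum(c2 != c1) + 1)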
This yields the following:
   value ord  cat extreme   c1   c2 chg
1     96   1 good         good good   1
2    254   2 good         good good   1
3    458   3 high    Peak high good   2
4    453   4 high         high high   2
5    567   5 high    Peak high high   2
6    313   6 good         good high   3
7    353   7 high    Peak high good   4
8     20   8  low  Trough  low high   5
9    487   9 high    Peak high  low   6
10    48  10  low  Trough  low high   7
11   288  11 good         good  low   8
12   171  12 good         good good   8
13   175  13 good         good good   8
14   462  14 high    Peak high good   9
15    95  15 good         good high  10
16   360  16 high         high good  11
17   407  17 high         high high  11
18   484  18 high    Peak high high  11
19   159  19 good         good high  12
20    36  20  low    <NA>  low good  13
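Once every run has an id in chg, each high or low series can be collapsed to one row with group_by() and summarise(). The following is only a sketch under a few assumptions: the values are typed in from the output above rather than drawn with sample() (the draw for a given seed can differ between R versions), lag() with a default stands in for the c1/c2 helper columns, series_summary and its column names are made up for illustration, and `.groups` needs dplyr 1.0.0 or later.

```r
library(dplyr)

# Values copied from the printed output above, so the result does not
# depend on the random number generator.
value <- c(96, 254, 458, 453, 567, 313, 353, 20, 487, 48,
           288, 171, 175, 462, 95, 360, 407, 484, 159, 36)

dat <- data.frame(value = value) %>%
  mutate(ord = row_number(),
         cat = ifelse(value > 350, "high",
                      ifelse(value < 90, "low", "good")),
         # A new run starts whenever cat differs from the previous row;
         # default = first(cat) keeps row 1 from counting as a change.
         chg = cumsum(cat != lag(cat, default = first(cat))) + 1)

# One row per high/low series (series_summary is a hypothetical name).
series_summary <- dat %>%
  filter(cat != "good") %>%
  group_by(chg) %>%
  summarise(start = min(ord),
            end = max(ord),
            # highest point of a "high" run, lowest point of a "low" run
            extreme_value = if (first(cat) == "high") max(value) else min(value),
            cat = first(cat),
            .groups = "drop")

print(series_summary)
```

With these values the chg column matches the one printed above, and series_summary has one row for each of the eight high or low runs. If you also want the "good" stretches listed, drop the filter() step.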