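I need to find the points where increasing or decreasing trends start and end. In this data, a difference of roughly 10 or less between consecutive values is considered noise (i.e., neither an increase nor a decrease). In the sample data below, the first increasing trend starts at 317 and ends at 432, and another starts at 441 and ends at 983. Each of these points should be recorded in a separate vector.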
sample <- c(312, 317, 380, 432, 438, 441, 509, 641, 779, 919,
            983, 980, 978, 983, 986, 885, 767, 758, 755)
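Below is an image of the main change points. Can anyone suggest a way to do this in R?
Answer 0: (score: 1)
Here is how to build a vector of change points: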
vec <- c(100312,100317,100380,100432,100438,100441,100509,100641,100779,100919,
100983,100980,100978,100983,100986,100885,100767,100758,100755,100755)
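# find the positions where a trend starts or stops: diff() gives the step
# between neighbours, rle() groups consecutive steps that are above/below the
# noise threshold of 10, and cumsum() + 1 turns the run lengths into positions in vec
idx <- cumsum(rle(abs(diff(vec)) > 10)$lengths) + 1
# create the new vector of change points:
newVec <- vec[idx]
print(newVec)
#> [1] 100317 100432 100441 100983 100986 100767 100755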
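The one-liner for `idx` packs several steps together. The sketch below splits it into named intermediate values so each step can be inspected; the names `steps`, `is_trend`, `runs` and `run_ends` are illustrative and not part of the original answer.

```r
vec <- c(100312, 100317, 100380, 100432, 100438, 100441, 100509, 100641,
         100779, 100919, 100983, 100980, 100978, 100983, 100986, 100885,
         100767, 100758, 100755, 100755)

steps    <- diff(vec)             # change between consecutive values
is_trend <- abs(steps) > 10       # TRUE where the change exceeds the noise threshold
runs     <- rle(is_trend)         # consecutive runs of noise (FALSE) and trend (TRUE)
run_ends <- cumsum(runs$lengths)  # position in `steps` of the last step of each run
idx      <- run_ends + 1          # step k joins vec[k] and vec[k + 1], so shift by one

runs$lengths
#> [1] 1 2 2 5 4 2 3
idx
#> [1]  2  4  6 11 15 17 20
```

Because the runs alternate between noise and trend, the end of a noise run marks the start of a trend and the end of a trend run marks its end. This relies on the series beginning with a noise step, which is the case for this data.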
# (optional) ignore the first and last observation as change points:
idx <- idx[idx != 1 & idx != length(vec)]
# update the new vector if you want the optional restriction applied:
newVec <- vec[idx]
print(newVec)
#> [1] 100317 100432 100441 100983 100986 100767
# split newVec into start and end change points (odd positions are starts, even positions are ends):
start_changepoints <- newVec[c(TRUE, FALSE)]
print(start_changepoints)
#> [1] 100317 100441 100986
end_changepoints <- newVec[c(FALSE, TRUE)]
print(end_changepoints)
#> [1] 100432 100983 100767
# to count the number of events, just measure the length of start_changepoints:
length(start_changepoints)
#> [1] 3
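The odd/even split above assumes the series starts with a noise step and that an increase is never followed directly by a decrease. As a hedged variation on the same `rle()` idea, the sketch below classifies each step as increase, decrease or noise and returns one row per trend. The function `find_trends`, its `noise` argument and the column names are hypothetical, not part of the original answer.

```r
# Hypothetical helper: one row per trend, with start/end position, value and direction.
find_trends <- function(x, noise = 10) {
  steps <- diff(x)
  # classify every step: +1 increase, -1 decrease, 0 noise
  state <- ifelse(abs(steps) > noise, sign(steps), 0)
  runs   <- rle(state)
  ends   <- cumsum(runs$lengths) + 1  # position in x where each run ends
  starts <- ends - runs$lengths       # position in x where each run begins
  keep   <- runs$values != 0          # drop the noise runs
  data.frame(
    start_index = starts[keep],
    end_index   = ends[keep],
    start_value = x[starts[keep]],
    end_value   = x[ends[keep]],
    direction   = ifelse(runs$values[keep] > 0, "increase", "decrease")
  )
}

# the data from the question
obs <- c(312, 317, 380, 432, 438, 441, 509, 641, 779, 919,
         983, 980, 978, 983, 986, 885, 767, 758, 755)
trends <- find_trends(obs)
print(trends)
```

On the question's data this gives three trends: 317 to 432 (increase), 441 to 983 (increase) and 986 to 767 (decrease). The number of rows, `nrow(trends)`, is the number of events.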
If you want to plot it, you can use:
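library(ggplot2)
# prepare the data for plotting: mark each change point and alternate start/end colours
df <- data.frame(index = seq_along(vec), vec, trends = NA, cols = NA)
df$trends[idx] <- idx
df$cols[idx] <- rep_len(c("green", "red"), length(idx))
# plot the series with a dashed vertical line at every change point
ggplot(df, aes(x = index, y = vec)) +
  geom_line() +
  geom_point() +
  geom_vline(aes(xintercept = trends, col = cols),
             lty = 2, lwd = 1, na.rm = TRUE) +
  scale_color_manual(values = c(green = "green", red = "red"),
                     breaks = c("green", "red"),
                     labels = c("Start", "End")) +
  xlab("Index") +
  ylab("Value") +
  guides(col = guide_legend("Trend State"))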
Output: