Question

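In the example below, my goal is to find the years in which the values in df_new$value fall below the threshold of -1.2 for 5 consecutive entries. I then want to return the corresponding unique values from the df_new$year column as the result. The problem I run into when linking the result of rle() to df_new$year is that the length of the rle() output does not match the length of df_new$year, so I cannot index it properly. The issue with rle() is that it does not return zeros, so k only contains runs of at least 1 value below the threshold. How can I improve this code to get what I need? Is there a way to force rle() to include zeros in k, or should I take another approach?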
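To illustrate the length mismatch, here is a minimal sketch on a made-up toy vector (the values are hypothetical and unrelated to the data in this question). rle() returns one entry per run, not one per element, so k$lengths is generally shorter than the input vector. A logical index that is shorter than the vector being subset is recycled by R, which is why the indexing runs without error but selects the wrong rows.

```r
# Hypothetical toy vector, not the data from the question
value <- c(-2, -2, 0, -2, -2, -2)
threshold <- -1.2

k <- rle(value < threshold)

k$lengths          # 2 1 3: one entry per run
k$values           # TRUE FALSE TRUE: the value repeated in each run
length(k$lengths)  # 3 runs
length(value)      # 6 elements

# The run lengths always add up to the length of the input
sum(k$lengths) == length(value)
```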

Current attempt:

# Example reproducible df:
set.seed(125)
df <- data.frame(V1=rnorm(10,-1.5,.5),
                 V2=rnorm(10,-1.5,.5),
                 V3=rnorm(10,-1.5,.5),
                 V4=rnorm(10,-1.5,.5),
                 V5=rnorm(10,-1.5,.5),
                 V6=rnorm(10,-1.5,.5),
                 V7=rnorm(10,-1.5,.5),
                 V8=rnorm(10,-1.5,.5),
                 V9=rnorm(10,-1.5,.5),
                 V10=rnorm(10,-1.5,.5))
library(reshape2)  # provides the melt() method for matrices used below
df_t <- t(df)
df_long <- melt(df_t)
df_long$year <- rep(1976:1985, each=nrow(df))
df_new <- data.frame(value=df_long$value,year=df_long$year)

# Threshold values:
threshold <- -1.2
consecutiveentries <- 5
number <- consecutiveentries - 1
# Start of the problem:
k <- rle(df_new$value < threshold)
years <- unique(df_new$year[k$lengths > number])

What I want:

> years
[1] 1976 1978 1979 1980 1982 1984 1985

Answer 1

It's ugly, but it works :)

df_new$year[cumsum(k$lengths)[which(k$lengths >= 5)-1]+1]

Part by part:

idx <- which(k$lengths >= 5)-1 gives you the indices in k$lengths that sit immediately before a run whose length is greater than or equal to 5.

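Then cumsum(k$lengths) builds the cumulative sum over k$lengths, and we take the elements at idx. This gives the number of rows that come before the first row of each >=5 sequence.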

Adding 1 to this result gives the index of the row where each sequence starts.
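The steps above can be sketched on a small made-up input. Everything here (value, year, n, starts, hit, in_run) is a hypothetical name or toy value chosen for illustration, not the data from the question. On this particular input the one-liner skips a run that starts at row 1, because index 0 is dropped silently, and it also picks up a run of values above the threshold, because k$values is never checked. The variant underneath is one possible way to handle both cases, and the rep() call shows how a per-run flag can be expanded back to one entry per row so that it lines up with the year column.

```r
# Hypothetical toy data, not the data from the question
value <- c(rep(-2, 5), 0, 0, -2, -2, 0, rep(-2, 5), rep(0, 5))
year  <- rep(2001:2004, each = 5)
threshold <- -1.2
n <- 5

k <- rle(value < threshold)  # lengths 5 2 2 1 5 5, values T F T F T F

# The one-liner, step by step
idx         <- which(k$lengths >= n) - 1   # 0 4 5
rows_before <- cumsum(k$lengths)[idx]      # 10 15 (index 0 is dropped)
run_start   <- rows_before + 1             # 11 16
year[run_start]                            # 2003 2004

# Variant: first row of every run, keeping only below-threshold runs
ends   <- cumsum(k$lengths)
starts <- ends - k$lengths + 1             # 1 6 8 10 11 16
hit    <- k$values & k$lengths >= n        # TRUE only for qualifying runs
years_start <- unique(year[starts[hit]])   # 2001 2003

# Expand the per-run flag to one entry per row (same length as value)
in_run    <- rep(hit, k$lengths)
years_any <- unique(year[in_run])          # every year touched by such a run
```

years_start reports the year in which each qualifying run begins, while years_any reports every year that a qualifying run overlaps; which of the two is wanted depends on how runs that cross a year boundary should be counted.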

Using rle() to index a data.frame: how to show zeros in the function to maintain the same vector length?
