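I am trying to extract the rows for the first occurrence of a variable in a data frame prior to a specific value that has already been selected in that data frame. Specifically, the output of head(df) is: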
date discharge event event.isolation some.column
1/1/2016 7.782711 NA NA FALSE
1/2/2016 7.349389 -5.567748 none TRUE
1/3/2016 7.053813 -4.021769 none TRUE
1/4/2016 7.421568 5.213554 none TRUE
1/5/2016 5.722443 -22.894418 none TRUE
1/6/2016 5.497342 -3.933662 none TRUE
1/7/2016 5.347890 -6.898281 none TRUE
1/8/2016 7.983489 4.289382 none TRUE
1/9/2016 8.488293 -19.28304 none TRUE
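For each event of -22 or smaller, I would like to find the first date before it with a discharge value of 7.7 or greater. In other words, I already know each event of interest. I want to search backward iteratively to find the first discharge value of 7.7 or greater before each selected event.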
I am essentially trying to combine "Extract rows for the first occurrence of a variable in a data frame" with "Select row prior to first occurrence of an event by group", but I am having difficulty.
The desired result would be df[1, ], since it contains the first discharge value over 7.7 (working backward) before my selected event in row 5.
Answer 0 (score: 0)
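This is not the most elegant solution, but it works for the example.
It first defines the intervals to search (one interval for each event < -22), then looks for discharge > 7.7 within each interval.
In this example, I assume that you do not want to return a row where both event < -22 and discharge > 7.7 hold, even if it is the first occurrence of discharge > 7.7 since the last event.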
df <- read.csv(text = 'date discharge event event.isolation some.column
1 1/1/2016 7.782711 NA <NA> FALSE
2 1/2/2016 7.349389 -5.567748 none TRUE
3 1/3/2016 7.053813 -4.021769 none TRUE
4 1/4/2016 7.421568 5.213554 none TRUE
5 1/5/2016 5.722443 -22.894418 none TRUE
6 1/6/2016 5.497342 -3.933662 none TRUE
7 1/7/2016 5.347890 -6.898281 none TRUE
8 1/8/2016 7.983489 4.289382 none TRUE',sep="")
## find the rows where event < -22; prepend 0 so that the first interval starts at row 1
d <- c(0, which(df$event < -22))
## Each interval runs from d[i] + 1 to d[i+1] - 1. Empty intervals are skipped (otherwise you would return rows where both event < -22 and discharge > 7.7)
new.df <- NULL
for (i in seq_len(length(d) - 1)) {
  if (d[i+1] > (d[i] + 1)) {
    ## look only inside the interval and take the rows for which discharge > 7.7 is TRUE
    hit <- subset(df[(d[i]+1):(d[i+1]-1), ], discharge > 7.7)
    ## append the first match instead of overwriting, so every event contributes its row
    if (nrow(hit) > 0) new.df <- rbind(new.df, hit[1, ])
  }
}
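
The loop above keeps the earliest row in each interval with discharge > 7.7. If the goal is instead the closest such row before each event (searching backward, as the question describes), a minimal sketch could look like the following. The helper name `last_before` and its arguments `event_max` and `discharge_min` are hypothetical, and the thresholds are treated as inclusive ("-22 or smaller", "7.7 or greater") as worded in the question. The event row itself is never returned, in line with the assumption stated above.

```r
# Rebuild the example data (values copied from the question; extra columns omitted).
df <- data.frame(
  date = c("1/1/2016", "1/2/2016", "1/3/2016", "1/4/2016",
           "1/5/2016", "1/6/2016", "1/7/2016", "1/8/2016"),
  discharge = c(7.782711, 7.349389, 7.053813, 7.421568,
                5.722443, 5.497342, 5.347890, 7.983489),
  event = c(NA, -5.567748, -4.021769, 5.213554,
            -22.894418, -3.933662, -6.898281, 4.289382),
  stringsAsFactors = FALSE
)

# Hypothetical helper: for each row with event <= event_max, return the index
# of the nearest earlier row with discharge >= discharge_min (NA if none exists).
last_before <- function(df, event_max = -22, discharge_min = 7.7) {
  event_rows <- which(df$event <= event_max)         # which() drops NA events
  high_rows  <- which(df$discharge >= discharge_min)
  vapply(event_rows, function(e) {
    prior <- high_rows[high_rows < e]                # candidates strictly before the event
    if (length(prior) > 0) max(prior) else NA_integer_
  }, integer(1))
}

idx <- last_before(df)
result <- df[idx[!is.na(idx)], ]
```

Unlike the loop, this sketch does not stop at the previous event: if no row since the last event qualifies, it keeps looking further back. Restricting `prior` to rows after the preceding event would restore the interval behaviour.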