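I have data from a driving experiment. An image of my data frame is attached as an example.
So far I have code that splits the data frame by the participant ID and trial number columns, searches the steering wheel angle (SWA) column, selects the first row in which the absolute steering wheel angle exceeds a dead-zone threshold, and saves that row to a new data frame, one row per trial: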
pilot_clean_new <- lapply(split(pilot_clean, list(pilot_clean$ppid, pilot_clean$trialn), drop = TRUE), function(data) {
  i <- data[abs(data$SWA) > 0.01, ] # find all observations that exceed the threshold
  if (nrow(i) == 0) return(NULL)    # handle cases where no observations meet the criteria
  return(i[1, ])                    # return only the first match
})
pilot_clean_new <- do.call(rbind.data.frame, pilot_clean_new)
pilot_clean_new <- dplyr::arrange(pilot_clean_new, ppid) # arrange() comes from dplyr
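However, as you can see from the image of pilot_clean_new, my timestamps are continuous. So for each trial I have the timestamp at which the steering wheel angle exceeded the threshold. I need to subtract the first timestamp of each trial from that timestamp, so that for each participant and trial I get the time elapsed before the steering angle went above the threshold.
Does anyone have suggestions on how to do this? My idea is to go back to the original dataset and use some form of loop, selecting the first timestamp of each trial with head() and subtracting it from the corresponding timestamp in the clean data frame.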
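Answer 0 (score: 0)
I generated a sample dataset that I believe reproduces the required conditions. Let me know if it does not.
I use dplyr for most of the operations: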
# load required libraries
library(magrittr)
library(dplyr)
# generate sample data
pilot_clean <-
  base::data.frame(
    ppid = base::c(base::rep(1, 15), base::rep(2, 15), base::rep(3, 15))
    , trialn = base::c(base::rep(1:3, 15))
    , SWA = base::sample(base::seq(0.00, 0.02, by = .001), 45, replace = TRUE)
  ) %>%
  dplyr::arrange(ppid, trialn) %>%
  dplyr::mutate(timestamp = base::sort(stats::runif(45, min = 5, max = 125)))
# set threshold
SWA_threshold <- 0.01
# force a null condition: keep every row of ppid 3, trial 3 below the threshold
pilot_clean[pilot_clean$ppid == 3 & pilot_clean$trialn == 3, "SWA"] <- SWA_threshold - .001
# determine first time in each ppid, trialn group
# (ungroup first so the grouping columns can be passed through transmute,
#  then use transmute to rename the timestamp for the later join)
pilot_clean_first_time <-
  pilot_clean %>%
  dplyr::group_by(ppid, trialn) %>%
  dplyr::filter(dplyr::row_number() == 1) %>%
  dplyr::ungroup() %>%
  dplyr::transmute(ppid, trialn, first_timestamp = timestamp)
# determine first time in each ppid, trialn group above threshold
# (abs() matches the dead-zone check in the question, so negative angles count too;
#  ungroup before transmute, which renames the timestamp for the later join)
pilot_clean_first_time_above_threshold <-
  pilot_clean %>%
  dplyr::group_by(ppid, trialn) %>%
  dplyr::filter(abs(SWA) > SWA_threshold) %>%
  dplyr::filter(dplyr::row_number() == 1) %>%
  dplyr::ungroup() %>%
  dplyr::transmute(ppid, trialn, first_timestamp_above_threshold = timestamp)
# get unique list of ppid and trialn (to enable left join and null condition)
pilot_ppid_trial_list <-
  pilot_clean %>%
  dplyr::select(ppid, trialn) %>%
  unique()
# produce final result set with ppid, trialn, first time, and first time above threshold
# (join keys are given explicitly; trials that never exceed the threshold get NA)
pilot_clean_new <-
  pilot_ppid_trial_list %>%
  dplyr::left_join(pilot_clean_first_time, by = c("ppid", "trialn")) %>%
  dplyr::left_join(pilot_clean_first_time_above_threshold, by = c("ppid", "trialn")) %>%
  dplyr::mutate(adjusted_first_timestamp_above_threshold = first_timestamp_above_threshold - first_timestamp) # calculate final result
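If the rows within each trial are already in time order, the same calculation can probably be collapsed into a single grouped summarise. The following is only a minimal sketch on a small hand-made data frame: `toy` and `elapsed` are illustrative names that are not part of the original data, the sketch assumes dplyr 1.0.0 or later (for the `.groups` argument), and it uses `abs(SWA)` as in the question's code.

```r
library(dplyr)

# toy data (hypothetical): one participant, two trials, rows already in time order
toy <- data.frame(
  ppid      = c(1, 1, 1, 1, 1, 1),
  trialn    = c(1, 1, 1, 2, 2, 2),
  SWA       = c(0.000, 0.020, 0.030, 0.001, -0.002, 0.005),
  timestamp = c(10, 11, 12, 20, 21, 22)
)
SWA_threshold <- 0.01

elapsed <- toy %>%
  group_by(ppid, trialn) %>%
  summarise(
    # first timestamp of the trial
    first_timestamp = first(timestamp),
    # first timestamp with |SWA| above the threshold; NA if there is none
    first_timestamp_above_threshold = timestamp[abs(SWA) > SWA_threshold][1],
    .groups = "drop"
  ) %>%
  mutate(elapsed = first_timestamp_above_threshold - first_timestamp)

print(elapsed)
```

Indexing an empty vector with `[1]` returns `NA`, so a trial that never leaves the dead zone (trial 2 in the toy data) still appears in the result, just with a missing elapsed time, which is the same behaviour the left joins above provide.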