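I am trying to create a new variable: when an event occurs, I want to look at the other events that fall within 1 unit of time of it, based on a time variable. Some sample data is below. I'm lost and don't know where to start.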
event<-c("Dribble","Pass","Dribble","Bad Shot","Shot Miss","Rebound","Pass","Pump Fake","Good Shot","Shot Miss")
time<-c(1,2,3,4,5,6,6.5,6.9,6.92,6.95)
player_id<-c(1,1,2,2,2,1,1,2,2,2)
pass_to_shot<-c("","Pass to Shot","","","","","Pass to Shot","","","")
test_data<-data.frame(player_id,event,time,pass_to_shot)
player_id  event      time  pass_to_shot
1          Dribble    1     NA
1          Pass       2     Pass to Shot
2          Dribble    3     NA
2          Bad Shot   4     NA
2          Shot Miss  5     NA
1          Rebound    6     NA
1          Pass       6.5   Pass to Shot
2          Pump Fake  6.9   NA
2          Good Shot  6.92  NA
I would like it to look like this:
player_id  event      time  pass_to_shot  chance_create
1          Dribble    1     NA
1          Pass       2     Pass to Shot
2          Dribble    3     NA
2          Bad Shot   4     NA
2          Shot Miss  5     NA
1          Rebound    6     NA
1          Pass       6.5   Pass to Shot  1
2          Pump Fake  6.9   NA
2          Good Shot  6.92  NA
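I haven't really figured out how to reference other observations in an R data set. Basically, if event == "Pass" and there is a "Good Shot" event within the next 1 second (one unit of time), then I want chance_create to equal 1. Any help would be great, thanks!
Answer 0 (score: 0)
You can use dplyr: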
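library(dplyr)
test_data %>%
  mutate(
    # "GoodShot" (no space) matches no event here, so only "Pass" rows are flagged
    event_of_interest = ifelse(event == "Pass" | event == "GoodShot", 1, 0),
    time_diff = c(diff(-time), NA),  # negated gap to the next row; NA on the last row
    chance_create = ifelse(abs(time_diff) < 1 & event_of_interest == 1, 1, 0)
  ) %>%
  select(-event_of_interest, -time_diff)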
Output:
player_id event time pass_to_shot chance_create
1 1 Dribble 1.00 0
2 1 Pass 2.00 Pass to Shot 0
3 2 Dribble 3.00 0
4 2 Bad Shot 4.00 0
5 2 Shot Miss 5.00 0
6 1 Rebound 6.00 0
7 1 Pass 6.50 Pass to Shot 1
8 2 Pump Fake 6.90 0
9 2 Good Shot 6.92 0
10 2 Shot Miss 6.95 0
That said, I'm not 100% sure my code is robust, i.e. I'm not sure it will always give the desired result.
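One way to probe this doubt is to run the same adjacent-row idea on data where the event right after a pass is not a good shot. The sketch below restates the logic in base R (so it runs without dplyr) on a small made-up data set; the `probe` data frame and its values are hypothetical and only serve as an illustration.

```r
# Hypothetical data: a pass followed 0.5 time units later by a dribble,
# with no "Good Shot" anywhere.
probe <- data.frame(
  event = c("Pass", "Dribble", "Shot Miss"),
  time  = c(1, 1.5, 4)
)

# Gap to the next row; the last row has no successor, hence NA.
time_diff <- c(diff(probe$time), NA)

# Same idea as the answer's test, restricted to passes:
# a pass whose next row is less than 1 time unit away.
probe$chance_create <- ifelse(
  probe$event == "Pass" & abs(time_diff) < 1, 1, 0
)

print(probe)
```

Here the pass is flagged even though no good shot follows, because the test only looks at the time gap to the next row and not at what that row's event is.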
Answer 1 (score: 0)
Here is another solution that may be more robust, though it's hard to tell with the current data:
library(dplyr)
test_data %>%
  filter(event %in% c("Pass", "Good Shot")) %>%
  arrange(time, event) %>%
  # Flag a Pass whose next Pass/Good Shot row is a Good Shot less than 1 time unit later
  mutate(chance_create = ifelse(event == "Pass" & (lead(time) - time) < 1 &
                                  lead(event) == "Good Shot", 1, NA)) %>%
  select(player_id, chance_create, time) %>%
  left_join(test_data, ., by = c("time", "player_id"))
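The order of the subtraction in the time test matters. A small base-R check, using the three Pass/Good Shot times from the sample data and a hand-rolled stand-in for `lead()` (the `next_time` vector is an illustration, not dplyr's implementation):

```r
# Times of the Pass / Good Shot rows in the sample data, already sorted.
time <- c(2, 6.5, 6.92)

# Base-R stand-in for dplyr::lead(): shift by one, pad with NA.
next_time <- c(time[-1], NA)

# Current minus next is never positive for ascending times,
# so a "< 1" test on it cannot filter anything out.
backward <- time - next_time
print(backward < 1)

# Next minus current is the actual gap to the following event.
gap <- next_time - time
print(gap < 1)
```

Only the second form separates the pass at time 2 (4.5 units before the next row) from the pass at time 6.5.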
Answer 2 (score: 0)
library(dplyr)

# Keep only passes and good shots, then flag a pass directly followed by a good shot
z1 <- test_data %>%
  filter(event == "Pass" | event == "Good Shot") %>%
  mutate(time_diff = c(diff(time), NA),
         chance_create = ifelse(event == "Pass" & lead(event) == "Good Shot" & time_diff <= 1, 1, 0)) %>%
  select(-time_diff)

# Join the flags back onto the full data; rows that were filtered out get 0
output <- merge(test_data, z1, by = c("player_id", "event", "time", "pass_to_shot"), all.x = TRUE) %>%
  arrange(time)
output$chance_create[is.na(output$chance_create)] <- 0
output
player_id event time pass_to_shot chance_create
1 Dribble 1.00 0
1 Pass 2.00 Pass to Shot 0
2 Dribble 3.00 0
2 Bad Shot 4.00 0
2 Shot Miss 5.00 0
1 Rebound 6.00 0
1 Pass 6.50 Pass to Shot 1
2 Pump Fake 6.90 0
2 Good Shot 6.92 0
2 Shot Miss 6.95 0
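The approaches above compare a row with the row that follows it. As an alternative sketch, assuming the goal is exactly the rule stated in the question (a "Good Shot" at most 1 time unit after a "Pass", whatever happens in between), the window can be checked directly in base R. The helper name `shot_follows` and its `window` parameter are made up for this example.

```r
# Sample data copied from the question.
event <- c("Dribble", "Pass", "Dribble", "Bad Shot", "Shot Miss",
           "Rebound", "Pass", "Pump Fake", "Good Shot", "Shot Miss")
time <- c(1, 2, 3, 4, 5, 6, 6.5, 6.9, 6.92, 6.95)
player_id <- c(1, 1, 2, 2, 2, 1, 1, 2, 2, 2)
test_data <- data.frame(player_id, event, time)

# Times at which a "Good Shot" occurs.
shot_times <- test_data$time[test_data$event == "Good Shot"]

# Hypothetical helper: TRUE if any good shot falls in (t, t + window].
shot_follows <- function(t, window = 1) {
  any(shot_times > t & shot_times <= t + window)
}

# A row creates a chance if it is a pass and a good shot follows in the window.
is_pass <- test_data$event == "Pass"
test_data$chance_create <- as.integer(
  is_pass & vapply(test_data$time, shot_follows, logical(1))
)

print(test_data)
```

On the sample data this flags only the pass at time 6.5. Because every pass is compared against all good-shot times, the result does not depend on which events happen to sit between the pass and the shot.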