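I have updated this question because (a) I did not state the problem clearly on my first attempt, and (b) my exact needs have also changed somewhat.
Special thanks to Hemmo for his considerable help so far, and apologies for not stating my question clearly. His code (which solved an earlier version of the problem) is shown in the answers section.
At a high level, I am looking for code that helps identify and distinguish the continuous blocks of free time of different individuals. More specifically, the code should ideally produce the final column (Desired_Outcome) of the example data frame below, which should make the requirement clearer.
Any help is much appreciated. The code for the test data frame follows the example.
Many thanks,
W
Example (note that the last column should be generated by the code and is included purely for illustration):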
         Week Name  Activity Hours Desired_Outcome
1  01/01/2013 Paul      Free    40               1
2  08/01/2013 Paul      Free    10               1
3  08/01/2013 Paul Project A    30               0
4  15/01/2013 Paul Project B    30               0
5  15/01/2013 Paul Project A    10               0
6  22/01/2013 Paul      Free    40               2
7  29/01/2013 Paul Project B    40               0
8  05/02/2013 Paul      Free    40               3
9  12/02/2013 Paul      Free    10               3
10 19/02/2013 Paul      Free    30               3
11 01/01/2013  Kim Project E    40               0
12 08/01/2013  Kim      Free    40               4
13 15/01/2013  Kim      Free    40               4
14 22/01/2013  Kim Project E    40               0
15 29/01/2013  Kim      Free    40               5
Code for the data frame:
Name=c(rep("Paul",10),rep("Kim",5))
Week=c("01/01/2013","08/01/2013","08/01/2013","15/01/2013","15/01/2013","22/01/2013","29/01/2013","05/02/2013","12/02/2013","19/02/2013","01/01/2013","08/01/2013","15/01/2013","22/01/2013","29/01/2013")
Activity=c("Free","Free","Project A","Project B","Project A","Free","Project B","Free","Free","Free","Project E","Free","Free","Project E","Free")
Hours=c(40,10,30,30,10,40,40,40,10,30,40,40,40,40,40)
Desired_Outcome=c(1,1,0,0,0,2,0,3,3,3,0,4,4,0,5)
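# data.frame() keeps Hours and Desired_Outcome numeric; cbind() would coerce every column to character
df=data.frame(Week,Name,Activity,Hours,Desired_Outcome,stringsAsFactors=FALSE)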
df
Answer 0 (score: 2)
Edit: since the question was edited several times, this had become messy, so I removed the old answers.
checkFree<-function(df){
  df$Week<-as.Date(df$Week,format="%d/%m/%Y")
  df$outcome<-numeric(nrow(df))
  if(df$Activity[1]=="Free"){ #check first
    counter<-1
    df$outcome[1]<-counter
  } else counter<-0
  for(i in seq_len(nrow(df))[-1]){ #rows 2..n; empty when there is only one row
    if(df$Activity[i]=="Free"){
      #rows dated within the 7 days before the current row
      LastWeek <- (df$Week >= (df$Week[i]-7) &
                   df$Week < (df$Week[i]))
      #no free time in the previous week means a new block starts
      if(all(df$Activity[LastWeek]!="Free"))
        counter<-counter+1
      df$outcome[i]<-counter
    }
  }
  df
}
splitdf<-split(df, df$Name) #one data frame per person
df<-unsplit(lapply(splitdf,checkFree),df$Name)
uniqs<-unique(df$Name) #for renumbering
for(i in seq_along(uniqs)[-1])
  df$outcome[df$Name==uniqs[i] & df$outcome>0]<-
    max(df$outcome[df$Name==uniqs[i-1]]) +
    df$outcome[df$Name==uniqs[i] & df$outcome>0]
df
That should do it, although the code above is probably far from optimal.
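For comparison, here is a more compact sketch of the same idea in base R. It is not taken from the answer above: the helper name label_free_blocks is hypothetical, and it assumes that consecutive weeks are exactly 7 days apart, so a gap of more than 7 days between one person's free weeks starts a new block. The data are the sample data from the question.

```r
# Sample data from the question (without the Desired_Outcome column)
df <- data.frame(
  Week = c("01/01/2013", "08/01/2013", "08/01/2013", "15/01/2013", "15/01/2013",
           "22/01/2013", "29/01/2013", "05/02/2013", "12/02/2013", "19/02/2013",
           "01/01/2013", "08/01/2013", "15/01/2013", "22/01/2013", "29/01/2013"),
  Name = c(rep("Paul", 10), rep("Kim", 5)),
  Activity = c("Free", "Free", "Project A", "Project B", "Project A", "Free",
               "Project B", "Free", "Free", "Free", "Project E", "Free", "Free",
               "Project E", "Free"),
  Hours = c(40, 10, 30, 30, 10, 40, 40, 40, 10, 30, 40, 40, 40, 40, 40),
  stringsAsFactors = FALSE
)
df$Week <- as.Date(df$Week, format = "%d/%m/%Y")

# Hypothetical helper: number the blocks of free weeks for ONE person.
# Returns 0 for non-free rows and 1, 2, ... for free rows.
label_free_blocks <- function(week, activity) {
  out <- integer(length(week))
  free <- activity == "Free"
  if (!any(free)) return(out)
  w <- as.numeric(week)                # days since the epoch
  free_weeks <- sort(unique(w[free]))  # distinct weeks with any free time
  # A new block starts at the first free week and after any gap > 7 days
  block_id <- cumsum(c(TRUE, diff(free_weeks) > 7))
  out[free] <- block_id[match(w[free], free_weeks)]
  out
}

# Apply per person, shifting the ids so that they are unique across people
df$outcome <- 0L
offset <- 0L
for (person in unique(df$Name)) {
  rows <- df$Name == person
  ids <- label_free_blocks(df$Week[rows], df$Activity[rows])
  ids[ids > 0] <- ids[ids > 0] + offset
  df$outcome[rows] <- ids
  offset <- max(offset, ids)
}
df
```

Because the block ids are computed on the distinct free weeks, two free rows in the same week always share one id, and the rows do not need to be sorted by date beforehand.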
Answer 1 (score: 1)
Using user1885116's comment on Hemmo's answer as a guide to what is wanted, here is a simpler approach:
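N <- 1  # maximum gap, in weeks, between free rows that count as consecutive
x <- with(df, df[Activity=='Free',])  # free rows only
y <- with(x, diff(Week)) <= N*7  # assumes Week is of class Date, so diff() is in days
df$outcome <- 0
# flag every free row that lies within N weeks of the next or the previous free row
df[rownames(x[c(y, FALSE) | c(FALSE, y),]),]$outcome <- 1
df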
##         Week  Activity Hours Desired_Outcome outcome
## 1 2013-01-01 Project A    40               0       0
## 2 2013-01-08 Project A    10               0       0
## 3 2013-01-08      Free    30               1       1
## 4 2013-01-15 Project B    30               0       0
## 5 2013-01-15      Free    10               1       1
## 6 2013-01-22 Project B    40               0       0
## 7 2013-01-29      Free    40               0       0
## 8 2013-02-05 Project C    40               0       0
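The output above comes from an earlier version of the question's data (a single person, with Week already stored as a Date). To make the approach reproducible, here is a self-contained sketch that rebuilds that data frame from the printed output and wraps the same logic in a function. The function name flag_free_runs is hypothetical, and the sketch assumes the rows are already sorted by Week.

```r
# Data frame reconstructed from the printed output above
df <- data.frame(
  Week = as.Date(c("2013-01-01", "2013-01-08", "2013-01-08", "2013-01-15",
                   "2013-01-15", "2013-01-22", "2013-01-29", "2013-02-05")),
  Activity = c("Project A", "Project A", "Free", "Project B",
               "Free", "Project B", "Free", "Project C"),
  Hours = c(40, 10, 30, 30, 10, 40, 40, 40),
  Desired_Outcome = c(0, 0, 1, 0, 1, 0, 0, 0),
  stringsAsFactors = FALSE
)

# Hypothetical wrapper: returns 1 for free rows that lie within N weeks of
# another free row, and 0 for everything else.
# Assumes the rows are sorted by Week.
flag_free_runs <- function(df, N = 1) {
  free_rows <- which(df$Activity == "Free")
  # TRUE where two successive free rows are at most N weeks apart
  close <- diff(as.numeric(df$Week[free_rows])) <= N * 7
  # A free row is in a run if it is close to the next or the previous free row
  in_run <- c(close, FALSE) | c(FALSE, close)
  outcome <- numeric(nrow(df))
  outcome[free_rows[in_run]] <- 1
  outcome
}

df$outcome <- flag_free_runs(df, N = 1)
df
```

With N = 1 only rows 3 and 5 are flagged, as in the output above. With N = 2 the two-week gap before row 7 is also tolerated, so flag_free_runs(df, N = 2) flags rows 3, 5 and 7.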