I hope someone can help me with this problem. I know that two nested for loops are not very efficient, but that was my first solution. I have a data frame (AllPat) of eye patients with a patient ID, a date, and a visit type ('o' for operations, 'c' for checkups):
#Pat Date Visit
#1,l 2015-03-30 c
#1,l 2015-06-03 o
#1,l 2015-07-01 o
#1,l 2015-07-20 c
#1,l 2016-03-16 o
#1,l 2016-04-13 o
#1,l 2016-05-09 c
#2,l 2014-12-23 c
#2,l 2015-01-21 o
#2,l 2015-03-16 c
#2,l 2015-11-23 o
I want to number the operations within each operation block for each patient ID, where a block is a run of operations between two checkups:
#Pat Date Visit Block
#1,l 2015-03-30 c
#1,l 2015-06-03 o 1
#1,l 2015-07-01 o 2
#1,l 2015-07-20 c
#1,l 2016-03-16 o 1
#1,l 2016-04-13 o 2
#1,l 2016-05-09 c
#2,l 2014-12-23 c
#2,l 2015-01-21 o 1
#2,l 2015-03-16 c
#2,l 2015-11-23 o 1
This is my current code:
for(i in unique(AllPat$Pat)){
  op <- 0
  for(j in AllPat$Pat){
    if(i == j) {
      if(AllPat$Visit[AllPat$Pat == j] == "o") {
        AllPat$Block[AllPat$Pat == j] <- op
        op <- op+1
      }
      else op<-0
    }
  }
}
My problem is that the values in $Block only become visible when I sort the data frame by hand in the viewer. Maybe someone has a better solution and can help me.
UPDATE: my current data frame with the suggested function rleid:
Patient Date Visit DiffDate Block
3,r 16.02.2016 m 0
3,r 16.02.2016 m 0 0
3,r 16.02.2016 m 0 0
3,r 16.02.2016 m 0 0
3,r 20.04.2016 o 64 1
3,r 18.05.2016 o 28 1 <<- should be 2
3,r 15.06.2016 o 28 1 <<- should be 3
3,r 04.07.2016 m 19 0
3,r 27.07.2016 o 23 1
3,r 24.08.2016 o 28 2
3,r 18.10.2016 o 55 3
Maybe I should change my difftime call? The current code for counting the blocks is:
n <- nrow(AllPat)
AllPat<- transform(AllPat, Block = ave(1:n, rleid(Patient, Visit, (DiffDate<= 60)), FUN = seq_along) * (Visit== "o"))
And this computes the difference between the dates:
setDT(AllPat)[, DiffDate:= difftime(AllPat$Date, shift(AllPat$Date), units = "days"), by = c("Patient")]
UPDATE
4,l 2015-05-18 m NA 0
4,l 2015-10-20 o 155 1
4,l 2016-05-31 o 224 2 <<- should be 1
4,l 2016-07-26 o 56 1
Answer 0 (score: 1)
rleid in the data.table package can help here. We have used 0 for the checkup rows.
library(data.table)
AllPatDT <- data.table(AllPat)
AllPatDT[, Block := ave(.I, rleid(X.Pat, Visit), FUN = seq_along) * (Visit == "o")]
giving:
> AllPatDT
X.Pat Date Visit Block
1: #1,l 2015-03-30 c 0
2: #1,l 2015-06-03 o 1
3: #1,l 2015-07-01 o 2
4: #1,l 2015-07-20 c 0
5: #1,l 2016-03-16 o 1
6: #1,l 2016-04-13 o 2
7: #1,l 2016-05-09 c 0
8: #2,l 2014-12-23 c 0
9: #2,l 2015-01-21 o 1
10: #2,l 2015-03-16 c 0
11: #2,l 2015-11-23 o 1
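To see why this works, it may help to look at what rleid contributes: it gives every run of identical consecutive (patient, visit) combinations its own id, and seq_along then counts within each run. The following is only a base R sketch of that idea. run_id is a hypothetical helper written for illustration, not a function from data.table, and the two vectors are typed in by hand from the example data.

```r
# Hypothetical helper: a base R stand-in for the run ids that rleid produces.
run_id <- function(...) {
  key <- paste(..., sep = "\r")                   # one combined key per row
  cumsum(c(TRUE, key[-1] != key[-length(key)]))   # new id whenever the key changes
}

pat   <- c(rep("#1,l", 7), rep("#2,l", 4))
visit <- c("c", "o", "o", "c", "o", "o", "c", "c", "o", "c", "o")

ids   <- run_id(pat, visit)
# Count rows within each run, then zero out the checkup rows.
block <- ave(seq_along(ids), ids, FUN = seq_along) * (visit == "o")

ids    # 1 2 2 3 4 4 5 6 7 8 9
block  # 0 1 2 0 1 2 0 0 1 0 1
```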
If you prefer a straight data.frame then using only rleid from the data.table package we have:
library(data.table)
n <- nrow(AllPat)
transform(AllPat, Block = ave(1:n, rleid(X.Pat, Visit), FUN = seq_along) * (Visit == "o"))
We have used the following as AllPat:
Lines <- "#Pat Date Visit
#1,l 2015-03-30 c
#1,l 2015-06-03 o
#1,l 2015-07-01 o
#1,l 2015-07-20 c
#1,l 2016-03-16 o
#1,l 2016-04-13 o
#1,l 2016-05-09 c
#2,l 2014-12-23 c
#2,l 2015-01-21 o
#2,l 2015-03-16 c
#2,l 2015-11-23 o"
AllPat <- read.table(text = Lines, header = TRUE, comment.char = "", as.is = TRUE)
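Regarding the updates in the question: putting (DiffDate <= 60) inside rleid splits a run whenever that flag flips, so an operation with a 64-day gap followed by one with a 28-day gap lands in two different runs. Below is a rough base R sketch of one possible alternative. It assumes that a new block should start at the first operation after a non-operation row, at a change of patient, or at an operation that follows the previous row by more than 60 days. That rule is my reading of the "should be" marks, not something the question states. count_blocks and max_gap are hypothetical names, and DiffDate is assumed to be a plain numeric number of days (use as.numeric on a difftime first).

```r
# Hypothetical helper, written for illustration only.
count_blocks <- function(patient, visit, diff_days, max_gap = 60) {
  n <- length(visit)
  is_op       <- visit == "o"
  new_patient <- c(TRUE, patient[-1] != patient[-n])
  prev_not_op <- c(TRUE, !is_op[-n])                   # previous row was not an operation
  long_gap    <- !is.na(diff_days) & diff_days > max_gap
  # Assumed rule: any of these conditions starts a new run.
  starts <- !is_op | new_patient | prev_not_op | long_gap
  run    <- cumsum(starts)
  # Count within each run and set non-operation rows to 0.
  ave(seq_len(n), run, FUN = seq_along) * is_op
}

# Rows typed in from the two UPDATE tables in the question.
patient  <- c(rep("3,r", 11), rep("4,l", 4))
visit    <- c("m", "m", "m", "m", "o", "o", "o", "m", "o", "o", "o",
              "m", "o", "o", "o")
diff_days <- c(NA, 0, 0, 0, 64, 28, 28, 19, 23, 28, 55,
               NA, 155, 224, 56)

block <- count_blocks(patient, visit, diff_days)
block  # 0 0 0 0 1 2 3 0 1 2 3 0 1 1 2
```

Under this assumed rule the last row of patient 4,l gets 2, because its 56-day gap continues the block started by the 224-day row. If that row should also be 1, the rule needs to be stated differently.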
Answer 1 (score: 0)
I searched for "[r] sequence within groups" and found an answer that I was able to adapt, using a trick I have (in all honesty, probably learned from G. Grothendieck) for making groups. The answer I adapted is from Martin Morgan (a certified R guru).
I added that to my trick that forms groups at points where a condition occurs:
> dat$seq <- cumsum(dat$Visit=="c")
> dat
Pat Date Visit seq
1 1,l 2015-03-30 c 1
2 1,l 2015-06-03 o 1
3 1,l 2015-07-01 o 1
4 1,l 2015-07-20 c 2
5 1,l 2016-03-16 o 2
6 1,l 2016-04-13 o 2
7 1,l 2016-05-09 c 3
8 2,l 2014-12-23 c 4
9 2,l 2015-01-21 o 4
10 2,l 2015-03-16 c 5
11 2,l 2015-11-23 o 5
> runs <- rle(paste(dat$Pat, dat$seq, sep = "\r"))
> dat$Seq <- unlist(lapply(runs$lengths, seq_len))
> dat
Pat Date Visit seq Seq
1 1,l 2015-03-30 c 1 1
2 1,l 2015-06-03 o 1 2
3 1,l 2015-07-01 o 1 3
4 1,l 2015-07-20 c 2 1
5 1,l 2016-03-16 o 2 2
6 1,l 2016-04-13 o 2 3
7 1,l 2016-05-09 c 3 1
8 2,l 2014-12-23 c 4 1
9 2,l 2015-01-21 o 4 2
10 2,l 2015-03-16 c 5 1
11 2,l 2015-11-23 o 5 2
> dat$Seq <- dat$Seq -1
> dat$Seq[dat$Seq==0] <- " "
> dat
Pat Date Visit seq Seq
1 1,l 2015-03-30 c 1
2 1,l 2015-06-03 o 1 1
3 1,l 2015-07-01 o 1 2
4 1,l 2015-07-20 c 2
5 1,l 2016-03-16 o 2 1
6 1,l 2016-04-13 o 2 2
7 1,l 2016-05-09 c 3
8 2,l 2014-12-23 c 4
9 2,l 2015-01-21 o 4 1
10 2,l 2015-03-16 c 5
11 2,l 2015-11-23 o 5 1
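The same grouping trick can probably be written more compactly. This is only a sketch: it assumes dat has the Pat and Visit columns shown above (recreated by hand here), and it keeps 0 for the checkup rows instead of a blank so the column stays numeric. The variable grp is introduced just for this example.

```r
dat <- data.frame(
  Pat   = c(rep("1,l", 7), rep("2,l", 4)),
  Visit = c("c", "o", "o", "c", "o", "o", "c", "c", "o", "c", "o"),
  stringsAsFactors = FALSE
)

# Every checkup opens a new group; grouping by Pat as well keeps patients apart.
grp <- cumsum(dat$Visit == "c")
# Running count of operations within each (patient, group) combination.
dat$Seq <- ave(as.integer(dat$Visit == "o"), dat$Pat, grp, FUN = cumsum)

dat$Seq  # 0 1 2 0 1 2 0 0 1 0 1
```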