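I have a table that captures website users' interactions. The column 'id' is the user's unique identifier. 'time' is the time difference between the current interaction and the previous one. 'conv' indicates whether the user converted at that step (1 or 0). A user may convert multiple times or not at all. I need to add a session counter with the following logic:
The counter should reset to 1 when id changes. Likewise, after a user has converted (i.e. conv = 1) and 'time' is greater than 10, the counter should reset to 1. A dummy data frame looks like this: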
df <- data.frame(id = c(1,1,1,1,1,1,1,1,1,1,1,2,2,2), conv = c(0,0,0,0,1,0,0,0,0,1,0,0,0,0), time= c(0,3,15,18,9,5,17,7,15,5,5,45,40,5))
id |conv |time
----
1 | 0 | 0
1 | 0 | 3
1 | 0 | 15
1 | 0 | 18
1 | 1 | 9
1 | 0 | 5
1 | 0 | 17
1 | 0 | 7
1 | 0 | 15
1 | 1 | 5
1 | 0 | 5
2 | 0 | 0
2 | 0 | 40
2 | 0 | 5
The final table should look like this:
id |conv |time | counter
----
1 | 0 | 0 | 1
1 | 0 | 3 | 1
1 | 0 | 15 | 2
1 | 0 | 18 | 3
1 | 1 | 9 | 3
1 | 0 | 5 | 3
1 | 0 | 17 | 1
1 | 0 | 7 | 1
1 | 0 | 15 | 2
1 | 1 | 5 | 2
1 | 0 | 5 | 2
2 | 0 | 0 | 1
2 | 0 | 40 | 2
2 | 0 | 5 | 2
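To make the target easy to check against, the expected counter column can be written out as a vector. The comments spell out one reading of the rule that is consistent with the table. This reading is inferred from the example rather than stated explicitly above: within an id, a row with time > 10 starts a new session and increments the counter, unless a conversion has happened since the last reset, in which case the counter goes back to 1. The name `expected` is only for illustration.

```r
# Expected counter values, copied row by row from the table above.
expected <- c(
  1,  # id 1, first row
  1,  # time 3 <= 10: same session
  2,  # time 15 > 10: new session
  3,  # time 18 > 10: new session
  3,  # conv = 1, time 9: same session, conversion noted
  3,  # time 5: same session
  1,  # time 17 > 10 after a conversion: reset
  1,  # time 7: same session
  2,  # time 15 > 10: new session
  2,  # conv = 1, time 5: same session, conversion noted
  2,  # time 5: same session
  1,  # id changes to 2: reset
  2,  # time 40 > 10: new session
  2   # time 5: same session
)
```

A candidate solution can then be compared with something like `all(df$counter == expected)`.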
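Answer 0: (score: 0)
You will probably want to iterate over the rows. Here is an example (I'm not sure it gives you the exact result you want, since the example is a bit confusing, but you can use the same approach):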
df <- data.frame(id = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2),
                 conv = c(0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0),
                 time = c(0, 3, 15, 18, 9, 5, 17, 7, 15, 5, 5, 45, 40, 5),
                 counter = 1) # start every row at 1 (recycled); the first row keeps this value
for (i in 2:nrow(df)) { # For each row in the data frame (starting at row 2):
  if (df[i, "id"] == df[i - 1, "id"]) { # If the current row's ID equals the previous row's ID...
    # definitions for clarity
    time <- df[i, "time"]
    prev_time <- df[i - 1, "time"]
    if (abs(time - prev_time) < 10) { # if the absolute difference is less than 10
      df[i, "counter"] <- df[i - 1, "counter"] # current counter = previous counter
    } else {
      df[i, "counter"] <- df[i - 1, "counter"] + 1 # current counter = previous counter + 1
    }
    # check conv
    if (df[i, "conv"] == 1 && df[i, "time"] > 10) {
      df[i, "counter"] <- 1
    }
  } else { # If the ID numbers don't match
    df[i, "counter"] <- 1 # set counter to 1
  }
}
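For what it's worth, here is a minimal sketch of the same row-by-row approach that does reproduce the counter column from the question on the dummy data. It rests on one reading of the rule, inferred from the expected table rather than stated outright: 'time' is already the gap to the previous interaction, a gap above 10 starts a new session, and a new session resets the counter to 1 if the user has converted since the last reset and otherwise increments it. `session_counter` and its `gap` argument are hypothetical names made up for this example, not part of any package.

```r
# Dummy data from the question.
df <- data.frame(
  id   = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2),
  conv = c(0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0),
  time = c(0, 3, 15, 18, 9, 5, 17, 7, 15, 5, 5, 45, 40, 5)
)

# Hypothetical helper: gap is the assumed threshold that starts a new session.
session_counter <- function(id, conv, time, gap = 10) {
  n <- length(id)
  counter <- integer(n)
  converted <- FALSE  # has the user converted since the last reset?
  for (i in seq_len(n)) {
    if (i == 1 || id[i] != id[i - 1]) {
      counter[i] <- 1L               # new user: start over
      converted <- FALSE
    } else if (time[i] > gap) {
      # new session: reset after a conversion, otherwise count up
      counter[i] <- if (converted) 1L else counter[i - 1] + 1L
      converted <- FALSE
    } else {
      counter[i] <- counter[i - 1]   # same session: carry the counter forward
    }
    if (conv[i] == 1) converted <- TRUE  # remember the conversion for later rows
  }
  counter
}

df$counter <- session_counter(df$id, df$conv, df$time)
```

The conversion flag is only consulted at the next session boundary, which is why rows 5 and 6 of the example keep counter 3 and the reset only shows up at row 7. How a row with both conv = 1 and time > 10 should be treated is not covered by the example, so the ordering chosen here is a guess.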