Question

I have a data recoding problem. Here is what my sample data looks like:

df <- data.frame(
  id = c(1,1,1,1,1,1,1, 2,2,2,2,2,2, 3,3,3,3,3,3,3),
  scores = c(0,1,1,0,0,-1,-1, 0,0,1,-1,-1,-1, 0,1,0,1,1,0,1),
  position = c(1,2,3,4,5,6,7, 1,2,3,4,5,6, 1,2,3,4,5,6,7),
  cat = c(1,1,1,1,1,0,0, 1,1,1,0,0,0, 1,1,1,1,1,1,1))

   id scores position cat
1   1      0        1   1
2   1      1        2   1
3   1      1        3   1
4   1      0        4   1
5   1      0        5   1
6   1     -1        6   0
7   1     -1        7   0
8   2      0        1   1
9   2      0        2   1
10  2      1        3   1
11  2     -1        4   0
12  2     -1        5   0
13  2     -1        6   0
14  3      0        1   1
15  3      1        2   1
16  3      0        3   1
17  3      1        4   1
18  3      1        5   1
19  3      0        6   1
20  3      1        7   1

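There are three ids in the dataset, and rows are ordered by the position variable. For each id, the first row where scores becomes -1 needs its scores value set to 0 and its cat value set to 1. For example, for id=1 that first row is at position 6; in that row, scores should be 0 and cat should be 1. Ids that have no scores = -1 are left as they are.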

The desired output should look like this:

   id scores position cat
1   1      0        1   1
2   1      1        2   1
3   1      1        3   1
4   1      0        4   1
5   1      0        5   1
6   1      0        6   1
7   1     -1        7   0
8   2      0        1   1
9   2      0        2   1
10  2      1        3   1
11  2      0        4   1
12  2     -1        5   0
13  2     -1        6   0
14  3      0        1   1
15  3      1        2   1
16  3      0        3   1
17  3      1        4   1
18  3      1        5   1
19  3      0        6   1
20  3      1        7   1

Any recommendations? Thanks.

Answer 1

You can do the following with the dplyr package:

library(dplyr)

df = mutate(df, cat = ifelse(scores == -1, 1, cat),
                scores = ifelse(scores == -1, 0, scores))

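With the mutate() function, I reassign the values of the scores and cat fields based on ifelse() conditional statements. For scores, if the score is -1 the value is replaced with 0; otherwise the score is kept as is. For cat, it likewise checks whether scores equals -1, assigning 1 when the condition is met and the existing cat value otherwise. Note that this recodes every row with scores == -1, not only the first one per id.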

Edit

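After our discussion in the comments, I think something along these lines should help (you may need to modify the logic, since I didn't fully follow the desired output here):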

for (i in seq_len(nrow(df))) {
    # Check if score is -1 (the last row is skipped, since it has no next row)
    if (i < nrow(df) && df[i, 'scores'] == -1) {
        # Update values for the next row
        df[i + 1, 'scores'] <- 0
        df[i + 1, 'cat'] <- 1
    }
}

Sorry, I didn't really follow the desired output; hopefully this helps you get to the answer!
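
For what it's worth, here is a minimal base R sketch of the rule as I read it from the question (recode only the first scores == -1 row within each id). It assumes rows are already ordered by position within each id, and recode_first_neg is a hypothetical helper name, not something from the original post:

```r
df <- data.frame(
  id = c(1,1,1,1,1,1,1, 2,2,2,2,2,2, 3,3,3,3,3,3,3),
  scores = c(0,1,1,0,0,-1,-1, 0,0,1,-1,-1,-1, 0,1,0,1,1,0,1),
  position = c(1,2,3,4,5,6,7, 1,2,3,4,5,6, 1,2,3,4,5,6,7),
  cat = c(1,1,1,1,1,0,0, 1,1,1,0,0,0, 1,1,1,1,1,1,1))

# Hypothetical helper: recode only the first scores == -1 row of each id.
recode_first_neg <- function(d) {
  # Row numbers of every -1, in increasing order
  neg_rows <- which(d$scores == -1)
  # Keep the first such row per id (assumes rows are sorted by position within id)
  first_neg <- neg_rows[!duplicated(d$id[neg_rows])]
  d$scores[first_neg] <- 0
  d$cat[first_neg] <- 1
  d
}

out <- recode_first_neg(df)
```

On the sample data this changes rows 6 and 11 only; id 3 is left untouched because it has no -1.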

Answer 2

This may be what you're after:

library(dplyr)

df %>%
  group_by(id) %>%
  mutate(i = which(scores == -1)[1]) %>% # within-id index of the first row with scores == -1 (NA if none)
  mutate(scores = case_when(position == i & scores != 0 ~ 0, TRUE ~ scores), # set that row's score to 0
         cat = ifelse(scores == -1, 0, 1)) %>% # then update cat from the new scores
  select(-i) # remove the helper column i
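
A small base R illustration of why ids without any -1 pass through unchanged in the pipeline above: which(...)[1] yields NA when nothing matches, so position == i is NA for every row of that id, and as far as I know case_when() does not treat an NA condition as a match. The vector names here are made up for the example:

```r
scores_with <- c(0, 1, -1, -1)     # an id that contains -1
scores_without <- c(0, 1, 0, 1)    # an id with no -1

first_with <- which(scores_with == -1)[1]        # 3
first_without <- which(scores_without == -1)[1]  # NA

# Comparing positions against NA gives NA for every row
pos_match <- seq_along(scores_without) == first_without
```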

Answer 3

After trying a few things and getting ideas from @Ricky and @e.matt, I came up with a solution.

This gives me the output I wanted.

Answer 4

Here is a data.table one-liner:

library( data.table )
setDT(df)
df[ df[, .(cumsum( scores == -1 ) == 1), by = .(id)]$V1, `:=`( scores = 0, cat = 1) ]

#     id scores position cat
#  1:  1      0        1   1
#  2:  1      1        2   1
#  3:  1      1        3   1
#  4:  1      0        4   1
#  5:  1      0        5   1
#  6:  1      0        6   1
#  7:  1     -1        7   0
#  8:  2      0        1   1
#  9:  2      0        2   1
# 10:  2      1        3   1
# 11:  2      0        4   1
# 12:  2     -1        5   0
# 13:  2     -1        6   0
# 14:  3      0        1   1
# 15:  3      1        2   1
# 16:  3      0        3   1
# 17:  3      1        4   1
# 18:  3      1        5   1
# 19:  3      0        6   1
# 20:  3      1        7   1
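
One caveat, with a sketch of a variant: cumsum(scores == -1) == 1 stays TRUE for any non -1 rows that sit between the first and second -1 of an id. That never happens in the sample data, but if it can happen in yours, selecting the row number of the first -1 explicitly may be safer. The small dt below is made-up data for illustration, and this is only one possible way to write it:

```r
library(data.table)

# Made-up data: id 1 has a non -1 row between two -1 rows, id 2 has no -1
dt <- data.table(
  id = c(1, 1, 1, 1, 2, 2, 2),
  scores = c(0, -1, 1, -1, 1, 0, 1),
  position = c(1, 2, 3, 4, 1, 2, 3),
  cat = c(1, 0, 1, 0, 1, 1, 1))

# .I holds global row numbers; which(...)[1] picks the first -1 within each id
# and gives NA for ids without any -1
first_neg <- dt[, .I[which(scores == -1)[1]], by = id]$V1
first_neg <- first_neg[!is.na(first_neg)]

# Recode only those rows, by reference
dt[first_neg, `:=`(scores = 0, cat = 1)]
```

Here only row 2 is recoded; with the cumsum condition, row 3 would have been set to 0 as well.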

Recoding by order in R