This is related to several duplicates (1, 2, 3), but I have run into a slightly different problem. So far I have only seen pandas solutions.
Consider the data.table defined below.
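I want to enumerate the unique classes (cl) within each group (gr), restarting the count at 1 in every group. The data: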
library(data.table)
dt = data.table(gr = rep(letters[1:2], each = 6),
                cl = rep(letters[1:4], each = 3))
    gr cl
 1:  a  a
 2:  a  a
 3:  a  a
 4:  a  b
 5:  a  b
 6:  a  b
 7:  b  c
 8:  b  c
 9:  b  c
10:  b  d
11:  b  d
12:  b  d
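Answer 0 (score: 3)
You can do this (sort by cl within gr first if equal cl values are not contiguous; otherwise a repeated value picks up the running count instead of its original id):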
dt[, id := cumsum(!duplicated(cl)), by = gr]
    gr cl id
 1:  a  a  1
 2:  a  a  1
 3:  a  a  1
 4:  a  b  2
 5:  a  b  2
 6:  a  b  2
 7:  b  c  1
 8:  b  c  1
 9:  b  c  1
10:  b  d  2
11:  b  d  2
12:  b  d  2
The same with dplyr:
library(dplyr)
dt %>%
  group_by(gr) %>%
  mutate(id = cumsum(!duplicated(cl)))
Or an rleid()-like possibility built on base rle(), which numbers consecutive runs of cl:
dt %>%
  group_by(gr) %>%
  mutate(id = with(rle(cl), rep(seq_along(lengths), lengths)))
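The sorting caveat above can be made concrete. The following is a minimal sketch on hypothetical data (dt2 and the column names id_cumsum and id_match are made up for illustration); it compares cumsum(!duplicated()) with match() against unique(), a variant that is not part of the answer but does not need the data to be sorted:

```r
library(data.table)

# Hypothetical unsorted data: in group "a", the value "x" reappears after "y"
dt2 <- data.table(gr = c("a", "a", "a", "b", "b"),
                  cl = c("x", "y", "x", "z", "z"))

# Running count of distinct values seen so far; the second "x" gets 2, not 1
dt2[, id_cumsum := cumsum(!duplicated(cl)), by = gr]

# Position of each value's first appearance within the group
dt2[, id_match := match(cl, unique(cl)), by = gr]

dt2
```

On sorted input such as the dt in the question, both columns should agree; they differ only when a cl value recurs after a different one within the same gr.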
Answer 1 (score: 3)
Try:
library(data.table)
dt[, id := rleid(cl), by=gr]
dt
# gr cl id
# 1: a a 1
# 2: a a 1
# 3: a a 1
# 4: a b 2
# 5: a b 2
# 6: a b 2
# 7: b c 1
# 8: b c 1
# 9: b c 1
#10: b d 2
#11: b d 2
#12: b d 2
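Note that rleid() numbers consecutive runs rather than distinct values, so it matches the desired output only when equal cl values sit next to each other within a group. A small sketch on hypothetical data (dt3, run_id and id are illustrative names), assuming it is acceptable to reorder the rows with setorder():

```r
library(data.table)

# Hypothetical data where "x" occurs in two separate runs
dt3 <- data.table(gr = rep("a", 3), cl = c("x", "y", "x"))

# On the unsorted rows every run gets its own number: three runs, three ids
dt3[, run_id := rleid(cl), by = gr]

# Sort by group and class (by reference), then number the runs again
setorder(dt3, gr, cl)
dt3[, id := rleid(cl), by = gr]

dt3
```

After sorting, both "x" rows share one id. If the original row order must be kept, sorting is not an option and a different approach is needed.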
Answer 2 (score: 0)
An alternative solution using factor, which does not require sorting first:
dt %>%
group_by(gr) %>%
mutate(id = as.numeric(factor(cl))) %>%
ungroup()
# # A tibble: 12 x 3
# gr cl id
# <chr> <chr> <dbl>
# 1 a a 1
# 2 a a 1
# 3 a a 1
# 4 a b 2
# 5 a b 2
# 6 a b 2
# 7 b c 1
# 8 b c 1
# 9 b c 1
#10 b d 2
#11 b d 2
#12 b d 2
Note that this automatically assigns the number/id according to the alphabetical order of the cl values within each gr group.
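If ids should follow the order of first appearance instead of alphabetical order, one possible variation (not part of the answer above) is to pass the levels explicitly. A minimal base R sketch on a hypothetical vector:

```r
# Hypothetical class vector in which "d" appears before "c"
cl <- c("d", "c", "d", "c")

# Default factor levels are sorted, so "c" becomes 1 and "d" becomes 2
alpha_id <- as.integer(factor(cl))

# Levels taken in order of first appearance, so "d" becomes 1 and "c" becomes 2
seen_id <- as.integer(factor(cl, levels = unique(cl)))

alpha_id
seen_id
```

The same expression should drop into the grouped mutate() call in place of as.numeric(factor(cl)).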