I have a data set similar to this:
library(data.table)
a = data.table(
ID = c(1, 1, 2, 2, 2, 3, 3),
TOUR = c("USA", "CHINA", "CHINA", "CHINA", "EUROPE", "CANADA", "USA")
)
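I would like to aggregate the data into one row per ID, with the number of bookings and a count for each tour, using data.table. Can someone tell me how to do this?
Thanks, Phil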
Answer 0 (score: 3)
We can first create "NUMBER_OF_BOOKINGS" grouped by "ID" using .N, i.e. the number of rows in each group, and then dcast with fun.aggregate set to length:
dcast(a[, NUMBER_OF_BOOKINGS := .N, ID], ID + NUMBER_OF_BOOKINGS ~ TOUR, length)
# ID NUMBER_OF_BOOKINGS CANADA CHINA EUROPE USA
#1: 1 2 0 1 0 1
#2: 2 3 0 2 1 0
#3: 3 2 1 0 0 1
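The two steps combined in the one-liner above can also be looked at separately. The following is a minimal sketch, assuming the same sample data; the names `bookings` and `tours` are introduced here for illustration only, and `value.var` is spelled out explicitly instead of letting dcast guess it.

```r
library(data.table)

a <- data.table(
  ID = c(1, 1, 2, 2, 2, 3, 3),
  TOUR = c("USA", "CHINA", "CHINA", "CHINA", "EUROPE", "CANADA", "USA")
)

# Step 1: .N is the number of rows in each ID group
bookings <- a[, .(NUMBER_OF_BOOKINGS = .N), by = ID]

# Step 2: one column per TOUR; length counts the rows in each ID/TOUR cell,
# so combinations that never occur are filled with 0
tours <- dcast(a, ID ~ TOUR, fun.aggregate = length, value.var = "TOUR")

print(bookings)
print(tours)
```

Neither step uses `:=`, so `a` itself is left unchanged here.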
If we need the prefix "TOUR_" on the new column names, use paste0 in the formula:
dcast(a[, NUMBER_OF_BOOKINGS := .N, ID], ID + NUMBER_OF_BOOKINGS ~
paste0("TOUR_", TOUR), length)
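The approach above also creates a column in the original data set, because := assigns by reference. If we want to avoid that, we can do a join instead: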
a[, .(NUMBER_OF_BOOKINGS = .N), ID][dcast(a, ID ~ paste0("TOUR_", TOUR), length), on = .(ID)]
# ID NUMBER_OF_BOOKINGS TOUR_CANADA TOUR_CHINA TOUR_EUROPE TOUR_USA
#1: 1 2 0 1 0 1
#2: 2 3 0 2 1 0
#3: 3 2 1 0 0 1
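Another way to leave the original table untouched, not part of the answer above, is to run the := step on a copy. This is a sketch only, assuming data.table's `copy()` is acceptable for the size of the data; the name `wide` is hypothetical.

```r
library(data.table)

a <- data.table(
  ID = c(1, 1, 2, 2, 2, 3, 3),
  TOUR = c("USA", "CHINA", "CHINA", "CHINA", "EUROPE", "CANADA", "USA")
)

# copy() makes a deep copy, so := adds the column to the copy, not to `a`
wide <- dcast(
  copy(a)[, NUMBER_OF_BOOKINGS := .N, by = ID],
  ID + NUMBER_OF_BOOKINGS ~ paste0("TOUR_", TOUR),
  fun.aggregate = length,
  value.var = "TOUR"
)

print(names(a))  # `a` still has only its original columns
print(wide)
```

The trade-off is an extra copy of the data in memory, whereas the join version builds two small aggregated tables instead.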