I'm new to R and I want to create a frequency table. I have data like this:
Member_id Car interest
1 FORD_MUSTANG 4
1 BUICK_Lucerne 1
1 CHEVROLET_SILVERADO 1
2 CHEVROLET_SILVERADO 1
2 FORD_MUSTANG 2
3 FORD_MUSTANG 6
I'd like to get a frequency table like this:
Member_id FORD_MUSTANG BUICK_Lucerne CHEVROLET_SILVERADO
1 4 1 1
2 2 0 1
3 6 0 0
I tried table(Member_id, car), but it returns a value of 1 for every car make.
Any help is appreciated.
Answer 0 (score: 2)
Try
library(reshape2)
dcast(df, Member_id~Car, value.var="interest", fill=0)
# Member_id BUICK_Lucerne CHEVROLET_SILVERADO FORD_MUSTANG
#1 1 1 1 4
#2 2 0 1 2
#3 3 0 0 6
Or
library(tidyr)
spread(df, Car, interest, fill=0)
# Member_id BUICK_Lucerne CHEVROLET_SILVERADO FORD_MUSTANG
#1 1 1 1 4
#2 2 0 1 2
#3 3 0 0 6
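If you want the columns created in a specific order (here, the order in which the cars first appear in the data):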
df$Car <- with(df, factor(Car, unique(Car)))
spread(df, Car, interest, fill=0)
# Member_id FORD_MUSTANG BUICK_Lucerne CHEVROLET_SILVERADO
#1 1 4 1 1
#2 2 2 0 1
#3 3 6 0 0
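As a side note, spread() has been superseded in newer tidyr releases by pivot_wider(). Below is a minimal sketch of the same reshaping, assuming tidyr 1.1.0 or later is installed (earlier versions handle values_fill differently). The data frame is rebuilt with data.frame() so the block runs on its own, and the name `wide` is only an illustrative choice. By default pivot_wider() orders the new columns by first appearance, so the factor step should not be needed here.

```r
library(tidyr)

# Same data as in the question, rebuilt so this block is self-contained
df <- data.frame(
  Member_id = c(1L, 1L, 1L, 2L, 2L, 3L),
  Car = c("FORD_MUSTANG", "BUICK_Lucerne", "CHEVROLET_SILVERADO",
          "CHEVROLET_SILVERADO", "FORD_MUSTANG", "FORD_MUSTANG"),
  interest = c(4L, 1L, 1L, 1L, 2L, 6L),
  stringsAsFactors = FALSE
)

# One column per Car, filled with interest; missing combinations become 0
wide <- pivot_wider(df,
                    names_from = Car,
                    values_from = interest,
                    values_fill = 0L)
```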
Data used in the examples above:
df <- structure(list(Member_id = c(1L, 1L, 1L, 2L, 2L, 3L), Car = c("FORD_MUSTANG",
"BUICK_Lucerne", "CHEVROLET_SILVERADO", "CHEVROLET_SILVERADO",
"FORD_MUSTANG", "FORD_MUSTANG"), interest = c(4L, 1L, 1L, 1L,
2L, 6L)), .Names = c("Member_id", "Car", "interest"), class = "data.frame", row.names = c(NA,
-6L))
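On why table() gave 1 everywhere: it only counts how many rows have each Member_id/Car pair and ignores the interest column. If you'd rather stay in base R, xtabs() can sum interest per cell instead. This is a rough sketch rather than part of the answer above; the names `counts`, `tab` and `wide` are just illustrative.

```r
# Same data as in the question, rebuilt so this block is self-contained
df <- data.frame(
  Member_id = c(1L, 1L, 1L, 2L, 2L, 3L),
  Car = c("FORD_MUSTANG", "BUICK_Lucerne", "CHEVROLET_SILVERADO",
          "CHEVROLET_SILVERADO", "FORD_MUSTANG", "FORD_MUSTANG"),
  interest = c(4L, 1L, 1L, 1L, 2L, 6L),
  stringsAsFactors = FALSE
)

# table() counts rows per Member_id/Car pair: 1 if the pair occurs, 0 if not
counts <- table(df$Member_id, df$Car)

# xtabs() sums the left-hand side (interest) within each Member_id/Car cell;
# combinations that are absent from the data come out as 0
tab <- xtabs(interest ~ Member_id + Car, data = df)

# Convert the contingency table to a plain wide data frame if needed
wide <- as.data.frame.matrix(tab)
```

Note that the columns come out in sorted order here; converting Car to a factor with the desired levels first, as in the answer, should control the order.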