Reshape 2 fields into margins and count their intersections

Time: 2014-07-18 02:08:37

Tags: r reshape

I am trying to reshape a sales data frame with this structure:

  categorization_one gender    created_at       fk_id_customer_info CONCAT_gender_cat_one
1 Toys               Feminino  13/11/2013 04:54 1                   Toys Female
2 Toys               Masculino 13/11/2013 04:54 2                   Toys Male
3 Computers          Masculino 14/11/2013 04:54 2                   Toys Male

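I want something like an Excel pivot table, where the rows are the values of the categorization_one field and the columns are the values of the CONCAT_gender_cat_one field. Each cell of the table would hold the count of records at the intersection of categorization_one and CONCAT_gender_cat_one. I tried the following code with the reshape package: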

cast(compras.parte.1,fk_id_customer_info ~ categorization_one, count, margins = TRUE)

But I got this error:

incorrect number of dimensions

Edit: here is the copy/paste of the result of dput(droplevels(head(compras.parte.1))):

structure(list(categorization_one = structure(c(4L, 2L, 3L, 2L, 
1L, 5L), .Label = c("Bebês/Alimentação/Mamadeiras", "Brinquedos/Desenhos e Pintura", 
"Brinquedos/Games e Eletrônicos/Laptops, Tablets e Cia", "Brinquedos/Primeira Infância", 
"Calçados/Sapatilhas"), class = "factor"), gender = structure(c(1L, 
2L, 2L, 2L, 3L, 1L), .Label = c("Feminino", "Masculino", "Masculino/Feminino"
), class = "factor"), created_at = structure(c(1L, 1L, 1L, 1L, 
1L, 1L), .Label = "13/11/2013 04:54", class = "factor"), fk_id_customer_info = structure(c(1L, 
2L, 2L, 2L, 3L, 1L), .Label = c("2", "3", "5"), class = "factor"), 
    GENDER_PESQUISA = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "não respondeu", class = "factor"), 
    cat_one_gender = structure(c(4L, 2L, 3L, 2L, 1L, 5L), .Label = c("Bebês/Alimentação/MamadeirasMasculino/Feminino", 
    "Brinquedos/Desenhos e PinturaMasculino", "Brinquedos/Games e Eletrônicos/Laptops, Tablets e CiaMasculino", 
    "Brinquedos/Primeira InfânciaFeminino", "Calçados/SapatilhasFeminino"
    ), class = "factor")), .Names = c("categorization_one", "gender", 
"created_at", "fk_id_customer_info", "GENDER_PESQUISA", "cat_one_gender"
), row.names = c(NA, 6L), class = "data.frame")

1 Answer:

Answer 0: (score: 1)

First, here is the data in a friendlier data.frame form:

dd<-structure(list(categorization_one = structure(c(2L, 2L, 1L), .Label = c("Computers", 
"Toys"), class = "factor"), gender = structure(c(1L, 2L, 2L), .Label = c("Feminino", 
"Masculino"), class = "factor"), created_at = structure(c(1384336440, 
1384336440, 1384422840), class = c("POSIXct", "POSIXt"), tzone = ""), 
    fk_id_customer_info = c(1L, 2L, 2L), CONCAT_gender_cat_one = structure(c(1L, 
    2L, 2L), .Label = c("Toys Female", "Toys Male"), class = "factor")), .Names = c("categorization_one", 
"gender", "created_at", "fk_id_customer_info", "CONCAT_gender_cat_one"
), row.names = c("1", "2", "3"), class = "data.frame")

Then it sounds like you just want a simple table:

with(dd, table(categorization_one, CONCAT_gender_cat_one))

#                   CONCAT_gender_cat_one
# categorization_one Toys Female Toys Male
#          Computers           0         1
#          Toys                1         1
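
Since the question's cast() call asked for margins = TRUE, the row and column totals can probably be added to the same cross-tabulation with base R's addmargins(). The following is a minimal sketch, assuming a cut-down copy of the dd data frame that keeps only the two fields being counted; the names tab, tab_m and tab_x are illustrative and do not come from the original post.

```r
# Cut-down copy of `dd` with only the two fields being cross-tabulated
# (assumption: the other columns are not needed for the counts)
dd <- data.frame(
  categorization_one    = c("Toys", "Toys", "Computers"),
  CONCAT_gender_cat_one = c("Toys Female", "Toys Male", "Toys Male")
)

# Count the records at each intersection of the two fields
tab <- with(dd, table(categorization_one, CONCAT_gender_cat_one))

# Append row and column totals; addmargins() labels them "Sum" by default
tab_m <- addmargins(tab)
print(tab_m)

# The same counts via the formula interface, which some find easier to read
tab_x <- xtabs(~ categorization_one + CONCAT_gender_cat_one, data = dd)
```

The grand total in tab_m["Sum", "Sum"] should equal nrow(dd), which is a quick way to check that no records were dropped while counting.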