Organize/clean a data frame

Time: 2017-05-03 18:32:07

Tags: r database

I need to manage the following data:

data <- data.frame(Name=c("11C","11C","11C","11C","11C","20D","20D"),
              PID=c("AD15E","AD15E","AD15E","AA05D","AA05D","Z48J","Z48J"),
              Type=c("Home","Auto","Auto","Home","Auto","Auto","Home"),
              Brand=c("A","B","C","H","I","P","D"),
              Model=c("A152","K235","W54","H2","A57","Z23","Y0878"))

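For each unique combination of Name and PID, I want to convert the data in rows into columns. PID "AD15E" has two rows of Type "Auto", so I want the second of those rows to go into its own set of columns.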

I'm not sure what I can use to achieve this.

I'm looking for clean data like the following:

result <- data.frame(Name=c("11C","11C","20D"),
               PID=c("AD15E","AA05D","Z48J"),
               Home.Brand=c("A","H","D"),
               Home.Model=c("A152","H2","Y0878"),
               Auto1.Brand=c("B","I","P"),
               Auto1.Model=c("K235","A57","Z23"),
               Auto2.Brand=c("C","",""),
               Auto2.Model=c("W54","",""))

1 answer:

Answer 0: (score: 1)

How about using data.table's dcast:

library(data.table)

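# Number the rows within each Name/PID/Type group (1, 2, ...), so repeated types get their own columns
data$count <- ave(seq_len(nrow(data)), data$Name, data$PID, data$Type, FUN = seq_along)
# Reshape to wide: one row per Name/PID, one Brand and one Model column per Type/count pair
dcast(setDT(data), Name + PID ~ Type + count, value.var = c("Brand", "Model"))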
#   Name   PID Brand_Auto_1 Brand_Auto_2 Brand_Home_1 Model_Auto_1 Model_Auto_2 Model_Home_1
#1:  11C AA05D            I           NA            H          A57           NA           H2
#2:  11C AD15E            B            C            A         K235          W54         A152
#3:  20D  Z48J            P           NA            D          Z23           NA        Y0878
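
If you want the result to look more like the layout in the question, here is a minimal sketch of one possible follow-up. It assumes the same sample data, builds the data.table directly (`dt` and `wide` are just illustrative names), replaces the NA values with empty strings, and renames columns from the `Brand_Auto_1` pattern to `Auto1.Brand`. Note that Home ends up as `Home1.Brand` rather than `Home.Brand`, because every type gets a counter.

```r
library(data.table)

# Same sample data as in the question, built directly as a data.table
dt <- data.table(
  Name  = c("11C", "11C", "11C", "11C", "11C", "20D", "20D"),
  PID   = c("AD15E", "AD15E", "AD15E", "AA05D", "AA05D", "Z48J", "Z48J"),
  Type  = c("Home", "Auto", "Auto", "Home", "Auto", "Auto", "Home"),
  Brand = c("A", "B", "C", "H", "I", "P", "D"),
  Model = c("A152", "K235", "W54", "H2", "A57", "Z23", "Y0878")
)

# Running index within each Name/PID/Type group
dt[, count := seq_len(.N), by = .(Name, PID, Type)]

# Long to wide, as in the answer above
wide <- dcast(dt, Name + PID ~ Type + count, value.var = c("Brand", "Model"))

# Replace NA with "" in the reshaped value columns
value_cols <- setdiff(names(wide), c("Name", "PID"))
for (col in value_cols) {
  set(wide, i = which(is.na(wide[[col]])), j = col, value = "")
}

# Rename "Brand_Auto_1" style names to "Auto1.Brand" style names
setnames(wide, value_cols,
         sub("^([^_]+)_([^_]+)_([^_]+)$", "\\2\\3.\\1", value_cols))

print(wide)
```

The column order still follows dcast (all Brand columns, then all Model columns); `setcolorder()` can rearrange them if the exact order matters.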