R columns in rows

Time: 2016-05-03 23:03:19

Tags: r list transpose

I have a list that I want to convert into a data table. The data looks like this:

"Customer"
"Steve"
"AddressLine"
"2041"
"Total"
"10"
"MailItemInfo"
"None"
"Customer"
"Mike"
"AddressLine"
"2043"
"Total"
"20"
"MailItemInfo"
"Yes"
"Customer"
"Jenn"
"AddressLine"
"1132"
"Total"
"24"
"MailItemInfo"
"Yes"

This pattern repeats for 5 different addresses in total. I want to turn every other row into a new column, for example:

"Customer"  "AddressLine"  "BatchTotal"  "MailItemInfo"
"Steve"     "2041"         "10"          "None"
"Mike"      "2043"         "20"          "Yes"
"Jenn"      "1132"         "24"          "Yes"

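Whenever new "columns" like these show up later in the data list, I would like them to be filled into new columns automatically. Any help with a solution would be greatly appreciated!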

2 answers:

Answer 0: (score: 0)

l <- list("Customer",
          "Steve",
          "AddressLine",
          "2041",
          "Total",
          "10",
          "MailItemInfo",
          "None",
          "Customer",
          "Mike",
          "AddressLine",
          "2043",
          "Total",
          "20",
          "MailItemInfo",
          "Yes",
          "Customer",
          "Jenn",
          "AddressLine",
          "1132",
          "Total",
          "24",
          "MailItemInfo",
          "Yes")

If the structure never changes, this works:

data.frame(matrix(unlist(l), ncol=8, byrow=TRUE))[c(2,4,6,8)]

     X2   X4 X6   X8
1 Steve 2041 10 None
2  Mike 2043 20  Yes
3  Jenn 1132 24  Yes
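
The one-liner above relies on every record listing the same labels in the same order. A minimal sketch of how that assumption could be checked, and how the labels could be reused as column names instead of X2, X4, ..., might look like this. The names `n_fields`, `keys`, `vals`, `same_layout` and `wide` are illustrative and not part of the original answer, and the sample list is shortened to two customers.

```r
# Sample values from the question, shortened to two customers
l <- list("Customer", "Steve", "AddressLine", "2041",
          "Total", "10", "MailItemInfo", "None",
          "Customer", "Mike", "AddressLine", "2043",
          "Total", "20", "MailItemInfo", "Yes")

v <- unlist(l)
n_fields <- 4  # assumed number of label/value pairs per record

# Odd positions hold labels, even positions hold values
keys <- matrix(v[c(TRUE, FALSE)], ncol = n_fields, byrow = TRUE)
vals <- matrix(v[c(FALSE, TRUE)], ncol = n_fields, byrow = TRUE)

# TRUE only if every record repeats the labels of the first record, in order
same_layout <- all(apply(keys, 1, function(r) identical(r, keys[1, ])))
stopifnot(same_layout)

wide <- data.frame(vals, stringsAsFactors = FALSE)
names(wide) <- keys[1, ]
wide
```

If `same_layout` is FALSE, the fixed-width matrix approach is not safe for that input and the long-format approach is the better fit.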

But if you want columns to be added as they appear, you probably need to work with a melted (long-format) data.frame. This is actually easier to do with data.table.

First, you have to decide what marks the start of a new row, for example "Customer":

l <- list("Customer",
      "Steve",
      "AddressLine",
      "2041",
      "Total",
      "10",
      "MailItemInfo",
      "None",
      "Customer",
      "Mike",
      "AddressLine",
      "2043",
      "Total",
      "20",
      "MailItemInfo",
      "Yes",
      "Customer",
      "Jenn",
      "AddressLine",
      "1132",
      "Total",
      "24",
      "MailItemInfo",
      "Yes",
      "NewColumn",
      "xxx")

library(data.table)
dt <- data.table(matrix(unlist(l), ncol=2, byrow=TRUE)) # long ("melted") table: V1 = field name, V2 = value
dt[V1=='Customer', id:=.I] # number each "Customer" row; all other rows stay NA
dt[, id := id[1], by = cumsum(!is.na(id))] # carry each id forward to the fields that follow it
dcast(dt,id~V1, value.var="V2", fill=NA) # cast long to wide: one row per id, one column per field

   id AddressLine Customer MailItemInfo NewColumn Total
1:  1        2041    Steve         None        NA    10
2:  2        2043     Mike          Yes        NA    20
3:  3        1132     Jenn          Yes       xxx    24
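
For readers without data.table, the same idea (tag each label/value pair with a record id, then cast to wide) can be sketched in base R. This is only an illustration under the same assumption that "Customer" starts each record. The names `long` and `wide` are hypothetical, and the sample list is shortened to two customers, the second of which carries the extra "NewColumn" field.

```r
l <- list("Customer", "Steve", "AddressLine", "2041",
          "Total", "10", "MailItemInfo", "None",
          "Customer", "Mike", "AddressLine", "2043",
          "Total", "20", "MailItemInfo", "Yes",
          "NewColumn", "xxx")

v <- unlist(l)

# Long format: one row per label/value pair
long <- data.frame(key   = v[c(TRUE, FALSE)],
                   value = v[c(FALSE, TRUE)],
                   stringsAsFactors = FALSE)

# Each "Customer" label opens a new record, so a running count gives the id
long$id <- cumsum(long$key == "Customer")

# Cast to wide; fields missing from a record become NA
wide <- reshape(long, idvar = "id", timevar = "key",
                v.names = "value", direction = "wide")
names(wide) <- sub("^value\\.", "", names(wide))
wide
```

Unlike `dcast`, `reshape` keeps the columns in order of first appearance rather than sorting them alphabetically.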

Answer 1: (score: 0)

Assume the list above is named "mylist". Try this:

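# Assumes mylist is a one-column data.frame whose column is a factor
# (droplevels() and the [row, 1] indexing below depend on that).
titleseq<-seq(1, 8, by =2)
titles<-droplevels(mylist[titleseq,1])

nameseq<-seq(2, 40, by=8) # 5 records of 8 rows each
names<-droplevels(mylist[nameseq,1])
addres<-droplevels(mylist[(nameseq+2),1])
tot<-droplevels(mylist[(nameseq+4),1])
mailitem<-droplevels(mylist[(nameseq+6),1])

df<-data.frame(names, addres, tot, mailitem)
names(df)<-as.character(titles)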

This is very much brute force, but it should do the job.
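
The same fixed-offset idea can be sketched on a plain character vector, deriving the record positions from the length of the data instead of hard-coding 40. This is a hedged variation rather than the answerer's code. The names `v`, `rec_len`, `starts`, `titles` and `cols` are illustrative, and the sample is shortened to two customers.

```r
v <- c("Customer", "Steve", "AddressLine", "2041",
       "Total", "10", "MailItemInfo", "None",
       "Customer", "Mike", "AddressLine", "2043",
       "Total", "20", "MailItemInfo", "Yes")

rec_len <- 8                              # assumed entries per record (4 labels + 4 values)
stopifnot(length(v) %% rec_len == 0)      # the offsets only make sense for whole records

starts <- seq(1, length(v), by = rec_len) # position of each record's first label
titles <- v[seq(1, rec_len, by = 2)]      # labels taken from the first record

# For each value offset (2, 4, 6, 8) pull that field out of every record
cols <- lapply(seq(2, rec_len, by = 2), function(off) v[starts + off - 1])
df <- data.frame(setNames(cols, titles), stringsAsFactors = FALSE)
df
```

As with the original answer, this breaks as soon as a record gains or loses a field, so it only suits input with a fixed layout.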