How to extract list names and values into a data frame

Date: 2018-10-30 08:41:58

Tags: r json dplyr tidyverse tidyr

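I am working with the JSON training file from Kaggle's https://www.kaggle.com/c/two-sigma-connect-rental-listing-inquiries/data to analyze the features and data, and to try other algorithms to see whether accuracy can be improved.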

For example, I have a column named features. A sample:

    l <- structure(list(`4` = c("Dining Room", "Pre-War", "Laundry in Building", 
    "Dishwasher", "Hardwood Floors", "Dogs Allowed", "Cats Allowed"
    ), `6` = c("Doorman", "Elevator", "Laundry in Building", "Dishwasher", 
    "Hardwood Floors", "No Fee"), `9` = c("Doorman", "Elevator", 
    "Laundry in Building", "Laundry in Unit", "Dishwasher", "Hardwood Floors"
    ), `10` = list(), `15` = c("Doorman", "Elevator", "Fitness Center", 
    "Laundry in Building")), .Names = c("4", "6", "9", "10", "15"
    ))

I want to build a data frame that looks like this:

    name     nested list
    4        <list = list(c("Dining Room", "Pre-War", "Laundry in Building", "Dishwasher", "Hardwood Floors", "Dogs Allowed", "Cats Allowed"))>
    6        <list = list(c("Doorman", "Elevator", "Laundry in Building", "Dishwasher", "Hardwood Floors", "No Fee"))>
    9        <list = list(c("Doorman", "Elevator", "Laundry in Building", "Laundry in Unit", "Dishwasher", "Hardwood Floors"))>
    10       <list = list(c())>
    15       <list = list(c("Doorman", "Elevator", "Fitness Center", "Laundry in Building"))>

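Please advise on how to do this. I am a bit confused about how to do the conversion.

My end goal is to build a data frame that combines all of these features and one-hot encodes them, so that each of 4, 6, 10, 15 ... gets its own 1 or 0 for every feature.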

1 answer:

Answer 0: (score: 1)

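One way is to use data.table::rbindlist() with the argument fill = TRUE, which lets you bind data frames that have different numbers of columns. In your case, however, the trick is to make the empty element show up as well. To do that, we add an if statement that creates an NA data frame for the empty list elements (here, element 10):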

    library(data.table)
    # One-row data frame per element; a single NA column for empty elements.
    rbindlist(lapply(l, function(i) {
      d <- as.data.frame(t(i))
      if (!ncol(d)) data.frame(V1 = NA) else d
    }), fill = TRUE)
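
The question also asks for a list column keyed by name and, in the end, a one-hot encoding. The following is only a minimal base R sketch of those two steps, not part of the original answer. The object names feats, df, onehot and encoded are made up for illustration, and it assumes every non-empty element is a plain character vector, as in the sample data.

```r
# Sample data copied from the question; element `10` is an empty list.
l <- list(
  `4`  = c("Dining Room", "Pre-War", "Laundry in Building", "Dishwasher",
           "Hardwood Floors", "Dogs Allowed", "Cats Allowed"),
  `6`  = c("Doorman", "Elevator", "Laundry in Building", "Dishwasher",
           "Hardwood Floors", "No Fee"),
  `9`  = c("Doorman", "Elevator", "Laundry in Building", "Laundry in Unit",
           "Dishwasher", "Hardwood Floors"),
  `10` = list(),
  `15` = c("Doorman", "Elevator", "Fitness Center", "Laundry in Building")
)

# Normalise every element to a character vector
# (the empty list becomes character(0)).
feats <- lapply(l, function(x) as.character(unlist(x)))

# Data frame with one row per name and a list column holding the features.
df <- data.frame(name = names(feats), stringsAsFactors = FALSE)
df$features <- unname(feats)

# One-hot matrix: one row per name, one column per distinct feature.
all_features <- sort(unique(unlist(feats)))
onehot <- t(vapply(feats,
                   function(x) as.integer(all_features %in% x),
                   integer(length(all_features))))
colnames(onehot) <- all_features

# Combine into one data frame; check.names = FALSE keeps the spaces
# in the feature column names.
encoded <- data.frame(name = names(feats), onehot,
                      check.names = FALSE, stringsAsFactors = FALSE)
```

Unlike the rbindlist() result, which places features in positional columns V1, V2, ..., this layout gives each distinct feature its own column, so the row for 10 is simply all zeros.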