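This is a direct follow-up to an earlier, similar question I asked about extracting a particular subset of a list of lists: Extracting data from a list of lists into its own `data.frame` with `purrr`
I will therefore use the same sample dataset: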
l <- list(structure(list(a = -1.54676469632688, b = "s", c = "T",
d = structure(list(id = 5L, label = "Utah", link = "Asia/Anadyr",
score = -0.21104594634643), .Names = c("id", "label", "link", "score")), e = 49.1279871269422), .Names = c("a", "b", "c", "d", "e")), structure(list(a = -0.934821052832427, b = "k", c = "T", d = list(structure(list(id = 8L, label = "South Carolina", link = "Pacific/Wallis", score = 0.526540892113734, externalId = -6.74354377676955), .Names = c("id", "label", "link", "score", "externalId")), structure(list(id = 9L, label = "Nebraska", link = "America/Scoresbysund", score = 0.250895465294041, externalId = 16.4257470807879), .Names = c("id", "label", "link", "score", "externalId"))), e = 52.3161400117052), .Names = c("a", "b", "c", "d", "e")), structure(list(a = -0.27261485993069, b = "f", c = "P", d = list(structure(list(id = 8L, label = "Georgia", link = "America/Nome", score = 0.526494135483816, externalId = 7.91583574935589), .Names = c("id", "label", "link", "score", "externalId")), structure(list(id = 2L, label = "Washington", link = "America/Shiprock", score = -0.555186440792989, externalId = 15.0686663219837), .Names = c("id", "label", "link", "score", "externalId")), structure(list(id = 6L, label = "North Dakota", link = "Universal", score = 1.03168296038975), .Names = c("id", "label", "link", "score")), structure(list(id = 1L, label = "New Hampshire", link = "America/Cordoba", score = 1.21582056168681, externalId = 9.7276418869132), .Names = c("id", "label", "link", "score", "externalId")), structure(list(id = 1L, label = "Alaska", link = "Asia/Istanbul", score = -0.23183264861979), .Names = c("id", "label", "link", "score")), structure(list(id = 4L, label = "Pennsylvania", link = "Africa/Dar_es_Salaam", score = 0.590245339334121), .Names = c("id", "label", "link", "score"))), e = 132.1153538536), .Names = c("a", "e")), structure(list(a = 0.202685974077313, b = "x", c = "O", d = structure(list(id = 3L, label = "Delaware", link = "Asia/Samarkand", score = 0.695577130634724, externalId = 15.2364820698193), .Names = c("id", "label", "link", "score", 
"externalId")), e = 97.9908914452971), .Names = c("a", "b", "c", "d", "e")), structure(list(a = -0.396243444741009, b = "z", c = "P", d = list(structure(list(id = 4L, label = "North Dakota", link = "America/Tortola", score = 1.03060272795705, externalId = -7.21666936522344), .Names = c("id", "label", "link", "score", "externalId")), structure(list(id = 9L, label = "Nebraska", link = "America/Ojinaga", score = -1.11397997280413, externalId = -8.45145052697411), .Names = c("id", "label", "link", "score", "externalId"))), e = 123.597945533926), .Names = c("a", "b", "c", "d", "e")))
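The general problem I am trying to solve is extracting the contents of nested lists that have different lengths and binding them to other contents of the same list, which essentially serve as an ID for the nested contents.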
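In the context of the sample dataset above, I am trying to extract the contents of the sublist `d` into a `data.table` / `data.frame`, while also extracting element `a` and repeating it for every extracted row. That way I can tell which of the elements extracted from `d` belong to the same subset, since they differ in length. An example of the desired `data.table` explains it best: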
a           id  label           link                  score       externalId
-1.5467647  5   Utah            Asia/Anadyr           -0.2110459  NA
-0.9348211  8   South Carolina  Pacific/Wallis        0.5265409   -6.743544
-0.9348211  9   Nebraska        America/Scoresbysund  0.2508955   16.42575
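Note that the first column, `a`, comes from the top-level sublists of `l`. The first row is the content of the first nested item in `d` (length 1), and the second and third rows are the contents of the second item in `d` (length 2), so those two rows share the same value of `a`.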
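To make the target concrete, here is a minimal base-R sketch, not a recommended solution, that builds those three rows from a cut-down copy of the first two elements of `l`. The names `l2`, `as_records`, and `cols` are made up for illustration, the numeric values are rounded, and the sketch assumes `d` is always either a single named record or an unnamed list of such records.

```r
# Cut-down copy of the first two elements of `l` (values rounded; illustration only).
l2 <- list(
  list(a = -1.5467647, b = "s", c = "T",
       d = list(id = 5L, label = "Utah", link = "Asia/Anadyr",
                score = -0.2110459),
       e = 49.12799),
  list(a = -0.9348211, b = "k", c = "T",
       d = list(
         list(id = 8L, label = "South Carolina", link = "Pacific/Wallis",
              score = 0.5265409, externalId = -6.743544),
         list(id = 9L, label = "Nebraska", link = "America/Scoresbysund",
              score = 0.2508955, externalId = 16.42575)),
       e = 52.31614)
)

# Hypothetical helper: wrap a single record so that `d` is always a list of
# records. Assumes a single record is a named list and a list of records is not.
as_records <- function(d) if (is.null(names(d))) d else list(d)

# Columns wanted in the result; a record lacking one of them gets NA.
cols <- c("id", "label", "link", "score", "externalId")

rows <- lapply(l2, function(x) {
  recs <- lapply(as_records(x$d), function(r) {
    r[setdiff(cols, names(r))] <- NA              # pad missing fields with NA
    as.data.frame(r[cols], stringsAsFactors = FALSE)
  })
  # `a` is recycled across however many records this element's `d` holds.
  cbind(a = x$a, do.call(rbind, recs))
})

result <- do.call(rbind, rows)
print(result)
```

The only non-obvious step is normalising `d`, because a length-1 `d` is stored as a bare record rather than as a list holding one record; once both cases look alike, repeating `a` is just recycling in `cbind()`.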
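At present, my solution for achieving this is roundabout and error-prone. Given how closely this relates to the post referenced above, I wonder whether I have simply failed to understand how to extend that approach to this related problem.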
Answer 0 (score: 4)