Question

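I have a JSON file that I want to convert to a data frame. The file looks like the sample below and follows this schema. (The JSON file comes from FB; you can download your entire friends list/profile in either HTML or JSON format.)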

    {
      "friends": [
        {
          "name": "Archie Andrews",
          "timestamp": 1539780292
        },
        {
          "name": "Betty Cooper",
          "timestamp": 1539005874
        },
        {
          "name": "Veronica Lodge",
          "timestamp": 1537680925
        },
        {
          "name": "Sabrina Spellman",
          "timestamp": 1381680968,
          "contact_info": "creepyhouse@666.com"
        }
      ]
    }

Normally, I can convert it to a data frame with 2 columns (name, timestamp) using the following code:

  library(rjson)
  friends <- fromJSON(file = "xxx.json")
  data_frame <- data.frame(matrix(unlist(friends), nrow = lengths(friends)+1, byrow = T), stringsAsFactors = FALSE)

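However, the annoying part is when an entry has a contact_info field, as in Sabrina's example. That value gets extracted too, which throws off the alignment of the rows. That is why nrow = lengths(friends)+1 is needed.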
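The skew can be reproduced without the file. The following is a minimal sketch, assuming the rjson package is installed and using an inline copy of the sample above; the variable names json_txt, fields_per_friend and flat are made up for illustration.

```r
library(rjson)

# Inline copy of the sample above, so the snippet runs without a file
json_txt <- '{"friends": [
  {"name": "Archie Andrews", "timestamp": 1539780292},
  {"name": "Betty Cooper", "timestamp": 1539005874},
  {"name": "Veronica Lodge", "timestamp": 1537680925},
  {"name": "Sabrina Spellman", "timestamp": 1381680968,
   "contact_info": "creepyhouse@666.com"}
]}'
friends <- fromJSON(json_str = json_txt)

# Number of fields per entry: the last entry has three instead of two
fields_per_friend <- lengths(friends$friends)
print(fields_per_friend)

# unlist() flattens every field into one vector, so the extra
# contact_info value shifts everything that comes after it
flat <- unlist(friends)
print(length(flat))  # 9 values for 4 friends, not a multiple of 2
```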

Archie Andrews      1539780292
Betty Cooper        1539005874
Veronica Lodge      1537680925
Sabrina Spellman    1381680968
creepyhouse@666.com Jughead Jones
1343582935          Midge Klump

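Is there a way, when extracting the list into 2 columns, to take only the first two elements (name, timestamp) of each entry? In the end I don't care about contact_info; I just want a 2-column data frame.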

Answer 1

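If I understood your question correctly, you can simply drop the unwanted columns afterwards. Note that jsonlite::read_json or jsonlite::fromJSON (with data frame simplification enabled) converts the xxx.json file into a list object whose first element is a data.frame. You can use the [[ subsetting operator to extract that element from the list.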

df <- jsonlite::read_json(path = "test.json", simplifyDataFrame = TRUE)[[1]] ## note the "[[" subsetting operator

df <- df[, c("name", "timestamp")] ## keep only the desired columns

Result:

> df
              name  timestamp
1   Archie Andrews 1539780292
2     Betty Cooper 1539005874
3   Veronica Lodge 1537680925
4 Sabrina Spellman 1381680968
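For anyone who prefers to stay with rjson, as in the question, the same result can probably be reached by picking the two wanted fields out of each entry before building the data frame. This is only a sketch: it assumes every entry has both name and timestamp, it uses an inline copy of the sample data instead of the real file, and the names json_txt, parsed, keep_fields and rows are made up for illustration.

```r
library(rjson)

# Inline copy of the sample data so the sketch runs without a file;
# with a real export you would call fromJSON(file = "xxx.json") instead
json_txt <- '{"friends": [
  {"name": "Archie Andrews", "timestamp": 1539780292},
  {"name": "Betty Cooper", "timestamp": 1539005874},
  {"name": "Veronica Lodge", "timestamp": 1537680925},
  {"name": "Sabrina Spellman", "timestamp": 1381680968,
   "contact_info": "creepyhouse@666.com"}
]}'
parsed <- fromJSON(json_str = json_txt)

# Keep only the two wanted fields from each entry, ignoring extras
# such as contact_info (assumes both fields exist in every entry)
keep_fields <- c("name", "timestamp")
rows <- lapply(parsed$friends, function(entry) {
  as.data.frame(entry[keep_fields], stringsAsFactors = FALSE)
})

# Stack the one-row data frames into a single two-column data frame
df <- do.call(rbind, rows)
print(df)
```

Selecting by field name rather than by position means the result does not depend on the order in which the fields appear in each entry.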

R: converting an FB JSON friends list to a data frame
