How do I unnest a list nested in a data.frame column?

Asked: 2019-02-07 15:46:53

Tags: r dplyr nested-lists purrr

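I'm new to working with nested lists, so I'm hoping any solution can also come with some commentary on how it works. I have a nested list that I scraped using jsonlite. How can I take the list data for all teams and bind it into one data.frame? The list is set up below. I copied one element of the list (for one team).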

This is the code I used to get to the list pasted below. I'm only showing it to give context on how the list is set up.

library(jsonlite)
library(dplyr) # provides %>%

# `url` holds the endpoint string (not shown here)
json <-
  url %>%
  fromJSON(simplifyDataFrame = TRUE)

df <- json$body$rosters

# Data frame with each team on its own row, but with nested lists in players
df_teams <- df$teams

# One team's worth of data
JSON_list <- df_teams[1, ]

The contents of my list are as follows.

JSON_list <- structure(list(
  projected_points = NA, long_abbr = "KE", lineup_status = "ok",
  short_name = "Kramerica", total_roster_salary = 22L, division = "",
  players = list(structure(list(
    firstname = c(
      "Jonathan", "Anthony"
    ), wildcards = structure(list(
      contract = c("1", "1"),
      salary = c("1", "21")
    ), class = "data.frame", row.names = c(
      NA,
      2L
    )), on_waivers = c(
      0L, 0L
    ), photo = c(
      "http://sports.cbsimg.net/images/baseball/mlb/players/170x170/1657581.png",
      "http://sports.cbsimg.net/images/baseball/mlb/players/170x170/1670417.png"
    ),
    eligible_for_offense_and_defense = c(0L, 0L),
    opponents = list(
      structure(list(
        game_id = c(
          "", ""
        ), weather_error = c(
          "Weather is not available for this game yet",
          "Weather is not available for this game yet"
        ),
        weather_icon_code = c(
          "", ""
        ), home_team = c("true", "true"),
        abbrev = c("OAK", "OAK"),
        time = c(
          1553803620L,
          1553911620L
        ),
        date = c(
          "20190328",
          "20190329"
        ), weather_icon_url = c(
          "", ""
        ), venue_type = c("", ""), game_abbr = c("", ""),
        weather = c("", ""), temperature = c(
          NA, NA
        )
      ), class = "data.frame", row.names = c(NA, 2L)),
      structure(list(game_id = c("", "", ""), weather_error = c(
        "Weather is not available for this game yet",
        "Weather is not available for this game yet", "Weather is not available for this game yet"
      ), weather_icon_code = c("", "", ""), home_team = c(
        "true",
        "true", "true"
      ), abbrev = c("TEX", "TEX", "TEX"), time = c(
        1553803500L,
        1553990700L, 1554062700L
      ), date = c(
        "20190328", "20190330",
        "20190331"
      ), weather_icon_url = c("", "", ""), venue_type = c(
        "",
        "", ""
      ), game_abbr = c("", "", ""), weather = c(
        "", "",
        ""
      ), temperature = c(NA, NA, NA)), class = "data.frame", row.names = c(
        NA,
        3L
      ))
    ), icons = structure(list(
      headline = c(
        "Angels' Jonathan Lucroy: Inks deal with Angels",
        NA
      ),
      hot = c(NA, 1L),
      cold = c(1L, NA),
      injury = c(
        "Knee: Questionable for start of season",
        NA
      )
    ), class = "data.frame", row.names = c(NA, 2L)), elias_id = c(
      "LUC758619", "RIZ253611"
    ), percentstarted = c(
      "48%", "97%"
    ),
    profile_link = c(
      "<a class='playerLink' aria-label=' Jonathan Lucroy C LAA' href='http://baseball.cbssports.com/players/playerpage/1657581'>Jonathan Lucroy</a> <span class=\"playerPositionAndTeam\">C | LAA</span> ",
      "<a class='playerLink' aria-label=' Anthony Rizzo 1B CHC' href='http://baseball.cbssports.com/players/playerpage/1670417'>Anthony Rizzo</a> <span class=\"playerPositionAndTeam\">1B | CHC</span>"
    ),
    id = c(
      "1657581", "1670417"
    ), pro_status = c(
      "A", "A"
    ), on_waivers_until = c(NA, NA), jersey = c("20", "44"),
    percentowned = c("61%", "99%"),
    pro_team = c(
      "LAA", "CHC"
    ), position = c(
      "C", "1B"
    ), lastname = c(
      "Lucroy", "Rizzo"
    ),
    roster_pos = c("C", "1B"),
    update_type = c("normal", "normal"),
    age = c(
      32L, 29L
    ), eligible = c(
      "C,U", "1B,U"
    ), is_locked = c(
      0L,
      0L
    ), bats = c(
      "R", "L"
    ), owned_by_team_id = c(
      12L, 12L
    ), ytd_points = c(
      0L, 0L
    ), roster_status = c(
      "A", "A"
    ), is_keeper = c(
      0L, 0L
    ), profile_url = c(
      "http://baseball.cbssports.com/players/playerpage/1657581",
      "http://baseball.cbssports.com/players/playerpage/1670417"
    ), fullname = c(
      "Jonathan Lucroy", "Anthony Rizzo"
    ), throws = c(
      "R",
      "L"
    ), headline = c(
      "Angels' Jonathan Lucroy: Inks deal with Angels",
      NA
    ), `starting-pitcher-today` = c(
      NA, "false"
    ), injury = c(NA, "Knee"), return = c(
      "Questionable for start of season",
      NA
    )
  ), class = "data.frame", row.names = c(NA, 2L))),
  name = "Kramerica Enterprises", logo = "http://baseball.cbssports.com/images/team-logo/main-36x36.jpg",
  abbr = "KE", point = "20190328", id = "12", active_roster_salary = 22L,
  warning = structure(list(description = NA_character_), row.names = 1L, class = "data.frame")
), row.names = 1L, class = "data.frame")

# Desired table sample (does not include all columns)
tibble::tribble(
  ~projected_points, ~long_abbr, ~lineup_status, ~short_name, ~total_roster_salary, ~division,               ~name, ~logo, ~abbr,  ~point5, ~active_roster_salary,    ~id2, ~firstname, ~contract, ~salary,
                 NA,       "KE",           "ok", "Kramerica",                   22,        NA, "Biloxi Blackjacks",    NA,  "KE", 20190328,                    22, 1657581, "Jonathan",         1,       1
  )                    

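The problem I'm running into is that the players column looks like a nested df, and it has other nested dfs inside it. Specifically: "wildcards", "opponents", and "icons". I'm looking for one data frame with all columns. For the nested lists, I'd like their contents to show up as columns for that specific player, i.e. for wildcards, create a column each for "contract" and "salary". Also, if I wanted to select only specific columns from JSON_list, how would I bind them together? For example "long_abbr" and "lineup_status" from the top level, "firstname" and "id" from the nested player data, and the columns inside wildcards.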

1 Answer:

Answer 0 (score: 0)

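If you have a nested structure, you can use [[ ]] to isolate list elements and [ ] to isolate columns. If the pieces have the same number of rows, you can use cbind to create the data frame directly.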
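As a quick illustration of the difference between the two operators, here is a minimal sketch. The `toy` data frame and the `lst` list are hypothetical and are not taken from the question's data.

```r
# Hypothetical toy data, only for illustrating the two operators
toy <- data.frame(x = 1:3, y = c("a", "b", "c"), stringsAsFactors = FALSE)

# Single brackets keep the container: the result is still a data frame
one_col_df <- toy["y"]
class(one_col_df)    # "data.frame"

# Double brackets extract the element itself: here, a plain character vector
one_col_vec <- toy[["y"]]
class(one_col_vec)   # "character"

# The same distinction holds for lists
lst <- list(first = toy, second = 10)
wrapped <- lst["first"]      # a list of length one
unwrapped <- lst[["first"]]  # the data frame stored inside
```

This is why the chained form `x[[i]][[j]][k]` works for nested lists: each `[[ ]]` steps one level down, and the final `[ ]` picks columns while keeping a data frame that cbind can use.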

Let's take a reproducible example.

Create 3 data frames with the same dimensions

 df1 <- data.frame(var1=c('a', 'b', 'c'), var2=c('d', 'e', 'f'), var3=1:3)
 df2 <- data.frame(var4=c('g', 'h', 'i'), var5=c('j', 'k', 'l'), var6=4:6)
 df3 <- data.frame(var7=c(6:8), var8=c('j', 'k', 'l'), var9=4:6)

Put the data frames into a nested list structure

 # Named inner.list rather than list to avoid masking base::list
 inner.list <- list(df1, df2)
 nested.list <- list(inner.list, df3)

Build a bound data frame consisting of var2, var6, and var7

 binded.df <- cbind(nested.list[[1]][[1]][2], nested.list[[1]][[2]][3], nested.list[[2]][1])
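
The same idea can be carried over to a structure shaped like the question's JSON_list. The following is only a sketch under stated assumptions: `team`, `players`, `p`, and `flat` are hypothetical names, and the data is a trimmed-down miniature that mirrors a few of the question's column names rather than the real list.

```r
# Hypothetical miniature of the question's structure: one team row whose
# `players` list column holds a data frame with a nested `wildcards` data frame
players <- data.frame(firstname = c("Jonathan", "Anthony"),
                      id = c("1657581", "1670417"),
                      stringsAsFactors = FALSE)
players$wildcards <- data.frame(contract = c("1", "1"),
                                salary = c("1", "21"),
                                stringsAsFactors = FALSE)

team <- data.frame(long_abbr = "KE", lineup_status = "ok",
                   stringsAsFactors = FALSE)
team$players <- list(players)

# [[ ]] pulls the player data frame out of the list column
p <- team$players[[1]]

# Every piece now has one row per player, so cbind applies;
# the single team row is repeated to match the number of players
flat <- cbind(
  team[rep(1, nrow(p)), c("long_abbr", "lineup_status")],
  p[c("firstname", "id")],
  p$wildcards
)
rownames(flat) <- NULL
flat
```

A column like opponents would not fit this pattern as-is, because it holds a different number of rows per player; it would need to be summarised or kept as a list column. Since the data came from jsonlite, `jsonlite::flatten(p)` may be another option for spreading nested data-frame columns such as wildcards into regular columns.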