How do I convert a JSON file to a data frame in R?

Time: 2018-08-09 01:01:04

Tags: r json dataframe

Link to data.

For my purposes, I downloaded the data from the link above and saved it as a JSON file.

json_convert <- do.call(rbind, lapply(paste(readLines("Myfile.json", warn=TRUE),
                         collapse=""), 
                   jsonlite::fromJSON))

So far I have managed to write the code above, but I am confused about how to turn the result into a data frame. Any help is appreciated.

1 Answer:

Answer 0 (score: 0)

Let's first inspect the structure of the data:

library(purrr)
library(tibble)
library(jsonlite)

my_json <- fromJSON("Myfile.json")
str(my_json)

List of 3
 $ resource  : chr "shotchartdetail"
 $ parameters:List of 30
  ..$ LeagueID      : chr "00"
  ..$ Season        : chr "2017-18"
  ..$ SeasonType    : chr "Regular Season"
  ..$ TeamID        : int 1610612750
  ..$ PlayerID      : int 0
  ..$ GameID        : NULL
  ..$ Outcome       : NULL
  ..$ Location      : NULL
  ..$ Month         : int 0
  ..$ SeasonSegment : NULL
  ..$ DateFrom      : NULL
  ..$ DateTo        : NULL
  ..$ OpponentTeamID: int 0
  ..$ VsConference  : NULL
  ..$ VsDivision    : NULL
  ..$ Position      : NULL
  ..$ RookieYear    : NULL
  ..$ GameSegment   : NULL
  ..$ Period        : int 0
  ..$ LastNGames    : int 0
  ..$ ClutchTime    : NULL
  ..$ AheadBehind   : NULL
  ..$ PointDiff     : NULL
  ..$ RangeType     : int 0
  ..$ StartPeriod   : int 1
  ..$ EndPeriod     : int 10
  ..$ StartRange    : int 0
  ..$ EndRange      : int 28800
  ..$ ContextFilter : chr "SEASON_YEAR='2017-18'"
  ..$ ContextMeasure: chr "FGA"
 $ resultSets:'data.frame': 2 obs. of  3 variables:
  ..$ name   : chr [1:2] "Shot_Chart_Detail" "LeagueAverages"
  ..$ headers:List of 2
  .. ..$ : chr [1:24] "GRID_TYPE" "GAME_ID" "GAME_EVENT_ID" "PLAYER_ID" ...
  .. ..$ : chr [1:7] "GRID_TYPE" "SHOT_ZONE_BASIC" "SHOT_ZONE_AREA" "SHOT_ZONE_RANGE" ...
  ..$ rowSet :List of 2
  .. ..$ : chr [1:7063, 1:24] "Shot Chart Detail" "Shot Chart Detail" "Shot Chart Detail" "Shot Chart Detail" ...
  .. ..$ : chr [1:20, 1:7] "League Averages" "League Averages" "League Averages" "League Averages" ...

Now you have to decide what you want in the data frame.

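I assume the player statistics are in the first element of $resultSets$rowSet (a character matrix with 7063 rows and 24 columns), and that the headers for those columns are in the first element of $resultSets$headers (24 names).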
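
That assumption can be checked before building anything. The sketch below uses a small made-up JSON string (`toy_json`, a hypothetical stand-in with the same nesting as the real file, not the actual data) and confirms that the first `rowSet` element is a character matrix whose column count matches the length of the first `headers` element. With the real file, the same checks would apply to `my_json$resultSets`.

```r
library(jsonlite)

# Toy stand-in for Myfile.json: same nesting, made-up and heavily shortened.
toy_json <- '{
  "resource": "shotchartdetail",
  "parameters": {"Season": "2017-18"},
  "resultSets": [
    {"name": "Shot_Chart_Detail",
     "headers": ["PLAYER_NAME", "SHOT_DISTANCE", "SHOT_MADE_FLAG"],
     "rowSet": [["Karl-Anthony Towns", 20, 0], ["Jimmy Butler", 25, 1]]},
    {"name": "LeagueAverages",
     "headers": ["GRID_TYPE", "SHOT_ZONE_BASIC"],
     "rowSet": [["League Averages", "Mid-Range"],
                ["League Averages", "Restricted Area"],
                ["League Averages", "Left Corner 3"]]}
  ]
}'

toy <- fromJSON(toy_json)
rs  <- toy$resultSets

shots   <- rs$rowSet[[1]]    # matrix of values, one row per shot
headers <- rs$headers[[1]]   # column names for that matrix

dim(shots)
length(headers)

# Stop early if the headers do not line up with the matrix columns.
stopifnot(is.matrix(shots), ncol(shots) == length(headers))
```

Because each row mixes strings and numbers, jsonlite stores the whole matrix as character, which is why every column in the final data frame comes out as `chr`.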

I'm sure there is a very elegant way to do this with the map functions in purrr. This isn't it, but it works:

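# Flatten one level so the columns of resultSets (name, headers, rowSet)
# become top-level elements. purrr:: is explicit because jsonlite, attached
# last, masks flatten() with a version that only accepts data frames.
my_list <- my_json %>%
  purrr::flatten()

# Name the columns of the 7063 x 24 matrix first, then convert it;
# as_tibble() replaces the deprecated as.tibble().
my_df <- my_list$rowSet[[1]] %>%
  `colnames<-`(my_list$headers[[1]]) %>%
  as_tibble()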

str(my_df)

Classes ‘tbl_df’, ‘tbl’ and 'data.frame':   7063 obs. of  24 variables:
 $ GRID_TYPE          : chr  "Shot Chart Detail" "Shot Chart Detail" "Shot Chart Detail" "Shot Chart Detail" ...
 $ GAME_ID            : chr  "0021700011" "0021700011" "0021700011" "0021700011" ...
 $ GAME_EVENT_ID      : chr  "10" "12" "16" "21" ...
 $ PLAYER_ID          : chr  "1626157" "202710" "202710" "201959" ...
 $ PLAYER_NAME        : chr  "Karl-Anthony Towns" "Jimmy Butler" "Jimmy Butler" "Taj Gibson" ...
 $ TEAM_ID            : chr  "1610612750" "1610612750" "1610612750" "1610612750" ...
 $ TEAM_NAME          : chr  "Minnesota Timberwolves" "Minnesota Timberwolves" "Minnesota Timberwolves" "Minnesota Timberwolves" ...
 $ PERIOD             : chr  "1" "1" "1" "1" ...
 $ MINUTES_REMAINING  : chr  "11" "11" "10" "10" ...
 $ SECONDS_REMAINING  : chr  "14" "9" "32" "21" ...
 $ EVENT_TYPE         : chr  "Missed Shot" "Made Shot" "Missed Shot" "Missed Shot" ...
 $ ACTION_TYPE        : chr  "Jump Shot" "Jump Shot" "Driving Reverse Layup Shot" "Jump Shot" ...
 $ SHOT_TYPE          : chr  "2PT Field Goal" "3PT Field Goal" "2PT Field Goal" "3PT Field Goal" ...
 $ SHOT_ZONE_BASIC    : chr  "Mid-Range" "Above the Break 3" "Restricted Area" "Left Corner 3" ...
 $ SHOT_ZONE_AREA     : chr  "Left Side Center(LC)" "Right Side Center(RC)" "Center(C)" "Left Side(L)" ...
 $ SHOT_ZONE_RANGE    : chr  "16-24 ft." "24+ ft." "Less Than 8 ft." "24+ ft." ...
 $ SHOT_DISTANCE      : chr  "20" "25" "1" "22" ...
 $ LOC_X              : chr  "-113" "199" "-11" "-225" ...
 $ LOC_Y              : chr  "169" "152" "6" "16" ...
 $ SHOT_ATTEMPTED_FLAG: chr  "1" "1" "1" "1" ...
 $ SHOT_MADE_FLAG     : chr  "0" "1" "0" "0" ...
 $ GAME_DATE          : chr  "20171018" "20171018" "20171018" "20171018" ...
 $ HTM                : chr  "SAS" "SAS" "SAS" "SAS" ...
 $ VTM                : chr  "MIN" "MIN" "MIN" "MIN" ...
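
Two follow-ups to the result above, sketched on a toy input rather than the real file. First, a possible purrr version that builds one tibble per result set by pairing each `rowSet` matrix with its own `headers`. Second, a conversion step, since every column arrives as character. The JSON string `toy_json`, the helper `to_tibble()`, and the choice of `numeric_cols` are all assumptions made for illustration. Which columns to convert is a judgment call: IDs such as `GAME_ID` are left as character here so the leading zeros survive.

```r
library(jsonlite)
library(purrr)
library(tibble)

# Toy stand-in for Myfile.json: same nesting, made-up and heavily shortened.
toy_json <- '{
  "resultSets": [
    {"name": "Shot_Chart_Detail",
     "headers": ["GAME_ID", "PLAYER_NAME", "SHOT_DISTANCE", "SHOT_MADE_FLAG"],
     "rowSet": [["0021700011", "Karl-Anthony Towns", 20, 0],
                ["0021700011", "Jimmy Butler", 25, 1]]},
    {"name": "LeagueAverages",
     "headers": ["GRID_TYPE", "SHOT_ZONE_BASIC"],
     "rowSet": [["League Averages", "Mid-Range"],
                ["League Averages", "Restricted Area"],
                ["League Averages", "Left Corner 3"]]}
  ]
}'

rs <- fromJSON(toy_json)$resultSets

# Hypothetical helper: attach the headers to one matrix, then convert it.
to_tibble <- function(rows, cols) {
  colnames(rows) <- cols
  as_tibble(rows)
}

# map2() walks rowSet and headers in parallel; the list is named by rs$name.
tables <- map2(rs$rowSet, rs$headers, to_tibble) %>%
  set_names(rs$name)

# Convert only the columns that are really numeric; the rest stay character.
shots <- tables$Shot_Chart_Detail
numeric_cols <- c("SHOT_DISTANCE", "SHOT_MADE_FLAG")
shots[numeric_cols] <- lapply(shots[numeric_cols], as.integer)

str(shots)
```

With the real data, the same pattern would give `tables$Shot_Chart_Detail` (7063 rows) and `tables$LeagueAverages` (20 rows), and `numeric_cols` would likely also include columns such as `LOC_X`, `LOC_Y`, and `PERIOD`.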