I have a JSON file that I want to parse into a pandas DataFrame. I would like it to have the following structure:
ID | Event | Site | ...
Game1 | "Rated Bullet game" | https://lichess.org/spTUcy1Z | ...
Game2 | "Rated Blitz game" | https://lichess.org/kh6FJkyS | ...
But when I use df = pd.json_normalize(all_rows), df looks like this:
--- | Game1.Event | Game1.Site | ...
0 | Rated Bullet game | https://lichess.org/spTUcy1Z | ...
The JSON looks like this:
{
"Game1": {
"Event": "Rated Bullet game",
"Site": "https://lichess.org/spTUcy1Z",
"Date": "2020.05.01",
"Round": "-",
"White": "AaravShah25",
"Black": "daksh_badhwar",
"Result": "0-1",
"UTCDate": "2020.05.01",
"UTCTime": "04:55:40",
"WhiteElo": "1360",
"BlackElo": "1342",
"WhiteRatingDiff": "-10",
"BlackRatingDiff": "+6",
"ECO": "C56",
"Opening": "Italian Game: Scotch Gambit, Nakhmanson Gambit",
"TimeControl": "60+0",
"Termination": "Normal",
"moves": "1. e4 { [%clk 0:01:00] } e5 { [%clk 0:01:00] } 2. d4 { [%clk 0:00:59] } Nc6 { [%clk 0:01:00] } 3. Nf3 { [%clk 0:00:58] } exd4 { [%clk 0:00:59] } 4. Bc4 { [%clk 0:00:58] } Nf6 { [%clk 0:00:56] } 5. O-O { [%clk 0:00:58] } Nxe4 { [%clk 0:00:54] } 6. Nc3 { [%clk 0:00:58] } Bc5 { [%clk 0:00:53] } 7. Re1 { [%clk 0:00:58] } O-O { [%clk 0:00:52] } 8. h3 { [%clk 0:00:56] } dxc3 { [%clk 0:00:50] } 0-1"
},
"Game2": {
"Event": "Rated Blitz game",
"Site": "https://lichess.org/kh6FJkyS",
"Date": "2020.05.01",
"Round": "-",
"White": "Quggai",
"Black": "vasiukov",
"Result": "1-0",
"UTCDate": "2020.05.01",
"UTCTime": "07:41:06",
"WhiteElo": "2292",
"BlackElo": "2210",
"WhiteRatingDiff": "+5",
"BlackRatingDiff": "-4",
"ECO": "C56",
"Opening": "Italian Game: Scotch Gambit, Nakhmanson Gambit",
"TimeControl": "180+0",
"Termination": "Normal",
"moves": "1. e4 { [%clk 0:03:00] } e5 { [%clk 0:03:00] } 2. Bc4 { [%clk 0:02:59] } Nf6 { [%clk 0:02:58] } 3. d4 { [%clk 0:02:58] } Nc6 { [%clk 0:02:51] } 4. Nf3 { [%clk 0:02:56] } exd4 { [%clk 0:02:48] } 5. O-O { [%clk 0:02:55] } Nxe4 { [%clk 0:02:46] } 6. Nc3 { [%clk 0:02:54] } dxc3 { [%clk 0:02:43] } 7. Bxf7+ { [%clk 0:02:48] } Kxf7 { [%clk 0:02:41] } 8. Qd5+ { [%clk 0:02:48] } Ke8 { [%clk 0:02:16] } 9. Re1 { [%clk 0:02:47] } Ne7 { [%clk 0:02:13] } 10. Rxe4 { [%clk 0:02:45] } c6 { [%clk 0:02:08] } 11. Qd6 { [%clk 0:02:30] } h6 { [%clk 0:01:32] } 12. Qg6# { [%clk 0:02:18] } 1-0"
},
...
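Thanks!
Answer 0 (score: 1)
This JSON is a dict of flat dicts, so the default DataFrame constructor can convert it directly. Note that the outer keys (Game1, Game2, ...) become the columns and the field names become the index: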
df = pd.DataFrame(all_rows)
To get one row per game instead, swap rows and columns with .transpose() (or just .T):
df = pd.DataFrame(all_rows).T
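As a minimal sketch of the two steps above, the snippet below uses a trimmed-down stand-in for all_rows (only the Event and Site fields, taken from the question) and assumes all_rows is the already-parsed dict. The names wide and df_alt are illustrative only, and DataFrame.from_dict with orient="index" is shown as an alternative that was not part of the original answer.

```python
import pandas as pd

# Trimmed-down stand-in for the parsed JSON (two fields per game only).
all_rows = {
    "Game1": {"Event": "Rated Bullet game", "Site": "https://lichess.org/spTUcy1Z"},
    "Game2": {"Event": "Rated Blitz game", "Site": "https://lichess.org/kh6FJkyS"},
}

# Default constructor: outer keys (games) become columns,
# inner keys (fields) become the index.
wide = pd.DataFrame(all_rows)

# Transposing gives one row per game and one column per field.
df = wide.T

# Alternative (not from the original answer): build the same layout directly.
df_alt = pd.DataFrame.from_dict(all_rows, orient="index")

print(df)
```

Both df and df_alt are indexed by the game keys, which matches the layout asked for in the question.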
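If the JSON contains many nested objects, you can instead use json_normalize to build the DataFrame from the dictionary values, then set the index to the game keys: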
df = pd.json_normalize(list(all_rows.values()))
df.index = all_rows.keys()
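To illustrate when the json_normalize route might be preferable, here is a small sketch with a nested field. The "Players" sub-object is hypothetical (the JSON in the question keeps White and Black at the top level); it is there only to show how nested keys are flattened into dotted column names. The variable name flat is also an assumption.

```python
import pandas as pd

# Hypothetical variant of the data with one nested object per game.
all_rows = {
    "Game1": {
        "Event": "Rated Bullet game",
        "Players": {"White": "AaravShah25", "Black": "daksh_badhwar"},
    },
    "Game2": {
        "Event": "Rated Blitz game",
        "Players": {"White": "Quggai", "Black": "vasiukov"},
    },
}

# Normalize the per-game dicts only; nested keys are joined with "." by default.
flat = pd.json_normalize(list(all_rows.values()))

# json_normalize produces a 0..n-1 index, so restore the game keys as labels.
flat.index = list(all_rows.keys())

print(flat)
```

Passing the values rather than the whole dict is what avoids the single-row Game1.Event, Game1.Site, ... result described in the question.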