[
{
"match_hometeam_score": "2 ",
"match_awayteam_score": " 0",
"statistics": [
{
"type": "Ball Possession",
"home": "70%",
"away": "30%"
},
{
"type": "Goal Attempts",
"home": "6",
"away": "3"
},
{
"type": "Shots on Goal",
"home": "4",
"away": "1"
},
{
"type": "Shots off Goal",
"home": "1",
"away": "2"
},
{
"type": "Blocked Shots",
"home": "1",
"away": "0"
},
{
"type": "Free Kicks",
"home": "10",
"away": "12"
},
{
"type": "Corner Kicks",
"home": "5",
"away": "2"
},
{
"type": "Offsides",
"home": "2",
"away": "1"
},
{
"type": "Goalkeeper Saves",
"home": "1",
"away": "2"
},
{
"type": "Fouls",
"home": "11",
"away": "9"
},
{
"type": "Yellow Cards",
"home": "2",
"away": "0"
},
{
"type": "Total Passes",
"home": "657",
"away": "272"
},
{
"type": "Tackles",
"home": "11",
"away": "18"
}
]
},
.....
]
Here is a small sample of the JSON file I have. I want to flatten it by extracting the values in the "statistics" column.
I tried:
flat_matches = pd.concat([all_matches.drop(['statistics'],axis=1),all_matches['statistics'].apply(pd.Series)], axis=1)
It sort of works, but not the way I hoped. I want to create a new df with the statistics as separate columns.
The CSV looks like this:
,match_hometeam_score,match_awayteam_score,statistics
0,3,1,"[{'type': 'Ball Possession', 'home': '44%', 'away': '56%'}, {'type': 'Goal Attempts', 'home': '15', 'away': '6'}, {'type': 'Shots on Goal', 'home': '5', 'away': '5'}, {'type': 'Shots off Goal', 'home': '9', 'away': '1'}, {'type': 'Blocked Shots', 'home': '1', 'away': '0'}, {'type': 'Corner Kicks', 'home': '3', 'away': '3'}, {'type': 'Offsides', 'home': '4', 'away': '2'}, {'type': 'Goalkeeper Saves', 'home': '4', 'away': '2'}, {'type': 'Fouls', 'home': '11', 'away': '10'}, {'type': 'Yellow Cards', 'home': '2', 'away': '4'}, {'type': 'Total Passes', 'home': '382', 'away': '503'}, {'type': 'Tackles', 'home': '13', 'away': '16'}, {'type': 'Attacks', 'home': '97', 'away': '136'}, {'type': 'Dangerous Attacks', 'home': '45', 'away': '63'}]"
1,1,2,"[{'type': 'Ball Possession', 'home': '61%', 'away': '39%'}, {'type': 'Goal Attempts', 'home': '22', 'away': '12'}, {'type': 'Shots on Goal', 'home': '10', 'away': '7'}, {'type': 'Shots off Goal', 'home': '6', 'away': '3'}, {'type': 'Blocked Shots', 'home': '6', 'away': '2'}, {'type': 'Corner Kicks', 'home': '7', 'away': '2'}, {'type': 'Offsides', 'home': '0', 'away': '2'}, {'type': 'Goalkeeper Saves', 'home': '5', 'away': '9'}, {'type': 'Fouls', 'home': '12', 'away': '13'}, {'type': 'Yellow Cards', 'home': '4', 'away': '4'}, {'type': 'Total Passes', 'home': '421', 'away': '271'}, {'type': 'Tackles', 'home': '14', 'away': '24'}, {'type': 'Attacks', 'home': '97', 'away': '86'}, {'type': 'Dangerous Attacks', 'home': '43', 'away': '46'}]"
2,1,2,"[{'type': 'Ball Possession', 'home': '48%', 'away': '52%'}, {'type': 'Goal Attempts', 'home': '16', 'away': '14'}, {'type': 'Shots on Goal', 'home': '4', 'away': '6'}, {'type': 'Shots off Goal', 'home': '6', 'away': '5'}, {'type': 'Blocked Shots', 'home': '6', 'away': '3'}, {'type': 'Corner Kicks', 'home': '4', 'away': '4'}, {'type': 'Offsides', 'home': '2', 'away': '6'}, {'type': 'Goalkeeper Saves', 'home': '4', 'away': '3'}, {'type': 'Fouls', 'home': '11', 'away': '14'}, {'type': 'Yellow Cards', 'home': '2', 'away': '7'}, {'type': 'Total Passes', 'home': '594', 'away': '643'}, {'type': 'Tackles', 'home': '24', 'away': '16'}, {'type': 'Attacks', 'home': '144', 'away': '130'}, {'type': 'Dangerous Attacks', 'home': '77', 'away': '36'}]"
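Thanks a lot for any help! Please tell me how to flatten this JSON dataset to a single level. I am a hobbyist and new to this. If I can improve the quality of the question, feel free to give me tips.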
Answer 0 (score: 0)
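The following shows how to convert a single row of the dataframe. You need to iterate over the rows and build a dataframe for each one as shown below.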
import json
import pandas as pd
sample_row = [{'type': 'Ball Possession', 'home': '44%', 'away': '56%'}, {'type': 'Goal Attempts', 'home': '15', 'away': '6'}, {'type': 'Shots on Goal', 'home': '5', 'away': '5'}, {'type': 'Shots off Goal', 'home': '9', 'away': '1'}, {'type': 'Blocked Shots', 'home': '1', 'away': '0'}, {'type': 'Corner Kicks', 'home': '3', 'away': '3'}, {'type': 'Offsides', 'home': '4', 'away': '2'}, {'type': 'Goalkeeper Saves', 'home': '4', 'away': '2'}, {'type': 'Fouls', 'home': '11', 'away': '10'}, {'type': 'Yellow Cards', 'home': '2', 'away': '4'}, {'type': 'Total Passes', 'home': '382', 'away': '503'}, {'type': 'Tackles', 'home': '13', 'away': '16'}, {'type': 'Attacks', 'home': '97', 'away': '136'}, {'type': 'Dangerous Attacks', 'home': '45', 'away': '63'}]
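# Round-trip through JSON text; for a plain list of dicts this is
# equivalent to calling pd.json_normalize(sample_row) directly.
js = json.dumps(sample_row)
df = pd.json_normalize(json.loads(js))
# Attach the match-level scores to every statistics row (scalars broadcast).
df['match_hometeam_score'] = 3
df['match_awayteam_score'] = 1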
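The snippet above handles one row and yields one line per statistic. A minimal sketch of the "iterate over the rows" step, producing one row per match with a home and an away column for each statistic, might look like the following. The function name `flatten_matches`, the `<type>_home` / `<type>_away` column naming, and the two-match sample are assumptions made for illustration; they are not from the question. The sketch also assumes the scores are always present and numeric, and that a `statistics` cell read back from CSV is a string holding a Python literal.

```python
import ast

import pandas as pd

# Hypothetical sample in the same shape as the data in the question.
matches = [
    {
        "match_hometeam_score": "2 ",
        "match_awayteam_score": " 0",
        "statistics": [
            {"type": "Ball Possession", "home": "70%", "away": "30%"},
            {"type": "Goal Attempts", "home": "6", "away": "3"},
        ],
    },
    {
        "match_hometeam_score": 1,
        "match_awayteam_score": 2,
        # A cell as it comes back from CSV: the list stored as a string.
        "statistics": "[{'type': 'Ball Possession', 'home': '48%', 'away': '52%'}, "
                      "{'type': 'Goal Attempts', 'home': '16', 'away': '14'}]",
    },
]


def flatten_matches(matches):
    """Return one row per match with <type>_home / <type>_away columns."""
    rows = []
    for match in matches:
        stats = match.get("statistics") or []
        if isinstance(stats, str):
            # Parse the string form safely back into a list of dicts.
            stats = ast.literal_eval(stats)
        row = {
            # Scores may carry stray whitespace ("2 ", " 0"), so strip first.
            "match_hometeam_score": int(str(match["match_hometeam_score"]).strip()),
            "match_awayteam_score": int(str(match["match_awayteam_score"]).strip()),
        }
        for stat in stats:
            row[f"{stat['type']}_home"] = stat["home"]
            row[f"{stat['type']}_away"] = stat["away"]
        rows.append(row)
    # Statistics missing from a match become NaN in that row.
    return pd.DataFrame(rows)


flat_matches = flatten_matches(matches)
print(flat_matches)
```

For a dataframe such as `all_matches`, passing `all_matches.to_dict("records")` to the function should give the same list-of-dicts input. The statistic values stay as strings here (for example '70%'); converting them to numbers is a separate step.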