Converting nested JSON to a DataFrame in Python

Time: 2019-02-03 09:10:01

Tags: json python-3.x pandas

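I requested a URL to get a JSON response from a website. The response is a list of dictionaries, and some of its elements contain a further list of dictionaries. I tried json_normalize, but it flattens only one level and does not turn the dictionaries inside those nested lists into DataFrame columns. Any suggestions would be appreciated.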

Here is one element of the list:

[{'matchID': '0b0943b1-5673-4408-bca4-c34e63a11cfc', 'matchIDinofficial': '20190202SAT77', 'matchNum': '77', 'matchDate': '2019-02-02+08:00', 'matchDay': 'SAT', 'coupon': {'couponID': '1', 'couponShortName': 'SAT', 'couponNameCH': '周六賽事', 'couponNameEN': 'Saturday Matches'}, 'league': {'leagueID': '124', 'leagueShortName': 'MXL', 'leagueNameCH': '墨西哥超級聯賽', 'leagueNameEN': 'Mexican Premier'}, 'homeTeam': {'teamID': '2041', 'teamNameCH': '迪祖亞拿', 'teamNameEN': 'Tijuana'}, 'awayTeam': {'teamID': '910', 'teamNameCH': '托盧卡', 'teamNameEN': 'Toluca'}, 'matchStatus': 'ResultIn', 'matchTime': '2019-02-03T11:06:00+08:00', 'statuslastupdated': '2019-02-03T11:56:05+08:00', 'inplaydelay': 'false', 'liveEvent': {'ilcLiveDisplay': True, 'hasLiveInfo': True, 'isIncomplete': False, 'matchIDbetradar': '16560915', 'matchstate': 'HalfTime', 'stateTS': '2019-02-03T11:07:44+08:00', 'liveevent': [{'order': 1, 'minutesElasped': '14', 'actionType': 'Regular', 'playerNameCH': '米拿保蘭奴斯', 'playerNameEN': 'Miller Bolanos', 'homeaway': 'Home'}, {'order': 2, 'minutesElasped': '25', 'actionType': 'YellowCard', 'playerNameCH': '菲臘比柏度', 'playerNameEN': 'Felipe Pardo', 'homeaway': 'Away'}]}, 'accumulatedscore': [{'periodvalue': 'FirstHalf', 'periodstatus': 'ResultFinal', 'home': '1', 'away': '0'}], 'livescore': {'home': '1', 'away': '0'}, 'cornerresult': '5', 'hasWebTV': False, 'hilodds': {'LINELIST': [{'LINENUM': '2', 'MAINLINE': 'false', 'LINESTATUS': '1', 'LINEORDER': '2', 'LINE': '3.5/3.5', 'L': '100@1.22', 'H': '100@3.80'}, {'LINENUM': '1', 'MAINLINE': 'true', 'LINESTATUS': '1', 'LINEORDER': '1', 'LINE': '2.5/2.5', 'H': '100@1.95', 'L': '100@1.75'}, {'LINENUM': '3', 'MAINLINE': 'false', 'LINESTATUS': '1', 'LINEORDER': '3', 'LINE': '2.0/2.5', 'L': '100@2.10', 'H': '100@1.65'}], 'ID': '0adfff89-9b63-4771-9008-96762227aca6', 'POOLSTATUS': 'Selling', 'INPLAY': 'true', 'ALLUP': 'true', 'Cur': '1'}, 'hasExtraTimePools': False, 'results': {}, 'definedPools': ['HAD', 'FHA', 'CRS', 'FCS', 'FTS', 'OOE', 
'TTG', 'HFT', 'HHA', 'HDC', 'HIL', 'FHL', 'CHL', 'NTS'], 'inplayPools': ['HAD', 'HIL', 'CHL', 'CRS', 'NTS']}]
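import requests
import pandas as pd
from pandas.io.json import json_normalize  # on pandas 1.0 and later, pd.json_normalize is the public name

url = 'url'  # placeholder for the real endpoint
response = requests.get(url).json()  # list of dicts, one per match
newdf = pd.DataFrame()
for match in response:
    df = json_normalize(match)  # flattens nested dicts into dotted column names
    newdf = newdf.join(df)  # raises the ValueError shown below once the column names overlap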

It gives me a ValueError, as shown below:

ValueError: columns overlap but no suffix specified:
Index(['awayTeam.teamID', 'awayTeam.teamNameCH', 'awayTeam.teamNameEN',
       'cornerresult', 'coupon.couponID', 'coupon.couponNameCH',
       'coupon.couponNameEN', 'coupon.couponShortName', 'definedPools',
       'hasExtraTimePools', 'hasWebTV', 'hilodds.ALLUP', 'hilodds.Cur',
       'hilodds.ID', 'hilodds.INPLAY', 'hilodds.LINELIST',
       'hilodds.POOLSTATUS', 'homeTeam.teamID', 'homeTeam.teamNameCH',
       'homeTeam.teamNameEN', 'inplayPools', 'inplaydelay', 'league.leagueID',
       'league.leagueNameCH', 'league.leagueNameEN', 'league.leagueShortName',
       'liveEvent.hasLiveInfo', 'liveEvent.ilcLiveDisplay',
       'liveEvent.isIncomplete', 'liveEvent.liveevent',
       'liveEvent.matchIDbetradar', 'liveEvent.matchstate',
       'liveEvent.stateTS', 'matchDate', 'matchDay', 'matchID',
       'matchIDinofficial', 'matchNum', 'matchStatus', 'matchTime',
       'statuslastupdated'],
      dtype='object')
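
The error above most likely comes from `DataFrame.join`, which places frames side by side on a shared index: from the second match onward, every column name already exists in `newdf`, so the names overlap. A minimal sketch of one way around it, assuming the goal is one row per match: pass the whole list to `json_normalize` at once, or stack the per-match frames with `pd.concat`. The two records below are trimmed, hypothetical stand-ins for the real response, and `pd.json_normalize` assumes pandas 1.0 or later.

```python
import pandas as pd

# Trimmed, hypothetical stand-in for requests.get(url).json()
response = [
    {"matchID": "m1", "homeTeam": {"teamNameEN": "Tijuana"}, "awayTeam": {"teamNameEN": "Toluca"}},
    {"matchID": "m2", "homeTeam": {"teamNameEN": "Team A"}, "awayTeam": {"teamNameEN": "Team B"}},
]

# json_normalize accepts a list of dicts and returns one row per dict
matches = pd.json_normalize(response)

# Alternative: normalize each record separately and stack the rows
stacked = pd.concat(
    [pd.json_normalize(match) for match in response],
    ignore_index=True,
)

print(matches)
```

Either form appends rows rather than columns, so repeated column names are expected and no suffix is needed.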

I would like the columns of the DataFrame to look like this:

matchID homeTeam.teamNameEN awayTeam.teamNameEN hilodds.LINELIST.LINENUM

The above is only a small example; I would like every key in the list of dictionaries to become a column header in the DataFrame.
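
Columns such as `hilodds.LINELIST.LINENUM` require expanding the list stored under `hilodds.LINELIST`, which `json_normalize` leaves as a single list-valued column. A possible sketch, assuming pandas 1.0 or later and assuming every match carries a non-empty `LINELIST`: flatten the dictionaries first, give each list entry its own row with `explode`, then normalize those entries into prefixed columns. The sample record is a trimmed copy of the data in the question; the variable names are illustrative only.

```python
import pandas as pd

# Trimmed copy of the record shown in the question
response = [{
    "matchID": "0b0943b1-5673-4408-bca4-c34e63a11cfc",
    "homeTeam": {"teamID": "2041", "teamNameEN": "Tijuana"},
    "awayTeam": {"teamID": "910", "teamNameEN": "Toluca"},
    "hilodds": {
        "LINELIST": [
            {"LINENUM": "2", "LINE": "3.5/3.5", "L": "100@1.22", "H": "100@3.80"},
            {"LINENUM": "1", "LINE": "2.5/2.5", "H": "100@1.95", "L": "100@1.75"},
        ],
        "POOLSTATUS": "Selling",
    },
}]

list_col = "hilodds.LINELIST"

# Step 1: flatten nested dicts; the nested list stays in one column
flat = pd.json_normalize(response)

# Step 2: one row per list entry, with match-level fields repeated
exploded = flat.explode(list_col).reset_index(drop=True)

# Step 3: turn each list entry (a dict) into prefixed columns
line_cols = pd.json_normalize(exploded[list_col].tolist()).add_prefix(list_col + ".")

# Step 4: swap the list column for the expanded columns
result = pd.concat([exploded.drop(columns=list_col), line_cols], axis=1)

print(result[["matchID", "homeTeam.teamNameEN", "awayTeam.teamNameEN",
              "hilodds.LINELIST.LINENUM"]])
```

The same steps could be repeated for other list-valued columns such as `liveEvent.liveevent`, though expanding several lists in one frame multiplies rows, so keeping one frame per nested list and joining on `matchID` may be easier to work with.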

0 answers:

No answers