How do I get this information into a DataFrame?

Asked: 2019-08-22 12:41:11

Tags: python json dataframe

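I am trying to save this information to a DataFrame, but I end up with an empty DataFrame. Here is a sample of the information I get from the API: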

[[{'$type': 'Tfl.Api.Presentation.Entities.Line, Tfl.Api.Presentation.Entities',
   'id': '1',
   'name': '1',
   'modeName': 'bus',
   'disruptions': [],
   'created': '2019-08-20T16:25:25.377Z',
   'modified': '2019-08-20T16:25:25.377Z',
   'lineStatuses': [],
   'routeSections': [{'$type': 'Tfl.Api.Presentation.Entities.MatchedRoute, Tfl.Api.Presentation.Entities',
     'name': 'New Oxford Street - Canada Water Bus Station',
     'direction': 'outbound',
     'originationName': 'New Oxford Street',
     'destinationName': 'Canada Water Bus Station',
     'originator': '490000235Z',
     'destination': '490004733D',
     'serviceType': 'Regular',
     'validTo': '2019-12-23T00:00:00Z',
     'validFrom': '2019-08-17T00:00:00Z'},
    {'$type': 'Tfl.Api.Presentation.Entities.MatchedRoute, Tfl.Api.Presentation.Entities',
     'name': 'Canada Water Bus Station - Tottenham Court Road',
     'direction': 'inbound',
     'originationName': 'Canada Water Bus Station',
     'destinationName': 'Tottenham Court Road',
     'originator': '490004733C',
     'destination': '490000235N',
     'serviceType': 'Regular',
     'validTo': '2019-12-23T00:00:00Z',
     'validFrom': '2019-08-17T00:00:00Z'}],
   'serviceTypes': [{'$type': 'Tfl.Api.Presentation.Entities.LineServiceTypeInfo, Tfl.Api.Presentation.Entities',
     'name': 'Regular',
     'uri': '/Line/Route?ids=1&serviceTypes=Regular'}],
   'crowding': {'$type': 'Tfl.Api.Presentation.Entities.Crowding, Tfl.Api.Presentation.Entities'}},
  {'$type': 'Tfl.Api.Presentation.Entities.Line, Tfl.Api.Presentation.Entities',
   'id': '100',
   'name': '100',
   'modeName': 'bus',
   'disruptions': [],
   'created': '2019-08-20T16:25:25.367Z',
   'modified': '2019-08-20T16:25:25.367Z',
   'lineStatuses': [],
   'routeSections': [{'$type': 'Tfl.Api.Presentation.Entities.MatchedRoute, Tfl.Api.Presentation.Entities',
     'name': "King Edward Street / St Pauls Station - St George's Town Hall / Shadwell Stn",
     'direction': 'outbound',
     'originationName': 'King Edward Street / St Pauls Station',
     'destinationName': "St George's Town Hall / Shadwell Stn",
     'originator': '490008743N',
     'destination': '490012020A',
     'serviceType': 'Regular',
     'validTo': '2019-12-23T00:00:00Z',
     'validFrom': '2019-08-17T00:00:00Z'}

I need to get this information into a DataFrame, so I tried the following code:

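import numpy as np
import pandas as pd

# info2 holds the nested list returned by the API (see the sample above)
info2 = np.squeeze(info2).tolist()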
dftypes = pd.DataFrame(columns = ["id", "name", "modeName", "routeSections"])
dfroutes=pd.DataFrame(columns =["$type","name","direction","originationName","destinationName","serviceType"])
i=0
j=0
for dic in info2:
    for key in dic:
        if key in dftypes.columns.tolist():
            dftypes.loc[i,key]=str(dic[key])

        if key=='routeSections':
            for dic2 in dic[key]:
                for key2 in dic2:
                    if key2 in dfroutes.columns.tolist():
                        dfroutes.loc[j,key2]=str(dic2[key2])
                j+=1
    i+=1

dftypes

However, I get an empty DataFrame. I want to extract all of the information: routeSections, name, modeName, and so on.

Could you tell me the right way to do this?

1 Answer:

Answer 0: (score: 0)

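Wouldn't it be better to use pd.read_json directly against the TfL API, rather than reading the data, munging it, and then trying to convert it to pandas? For example: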

import pandas as pd
x = pd.read_json('https://api.tfl.gov.uk/StopPoint/490006192S/arrivals')

This gives me a DataFrame with the following columns:

$type                  6 non-null object
id                     6 non-null int64
operationType          6 non-null int64
vehicleId              6 non-null object
naptanId               6 non-null object
stationName            6 non-null object
lineId                 6 non-null object
lineName               6 non-null object
platformName           6 non-null object
direction              6 non-null object
bearing                6 non-null int64
destinationNaptanId    6 non-null object
destinationName        6 non-null object
timestamp              6 non-null datetime64[ns, UTC]
timeToStation          6 non-null int64
currentLocation        6 non-null object
towards                6 non-null object
expectedArrival        6 non-null object
timeToLive             6 non-null object
modeName               6 non-null object
timing                 6 non-null object
dtypes: datetime64[ns, UTC](1), int64(4), object(16)

You can then manipulate the data directly in pandas.
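For the nested routeSections data in the question, one possible way to do that manipulation is pd.json_normalize (a top-level function in pandas 1.0 and later; older versions expose it as pandas.io.json.json_normalize). The following is only a minimal sketch: the records are abbreviated copies of the question's sample, the outer list-of-lists shape is assumed from that sample, and the "route_" prefix is an arbitrary choice made here to keep the line's "name" separate from each section's "name".

```python
import pandas as pd

# Abbreviated stand-in for the API response in the question:
# an outer list wrapping a list of line records (most keys omitted).
info2 = [[
    {
        "id": "1",
        "name": "1",
        "modeName": "bus",
        "routeSections": [
            {"name": "New Oxford Street - Canada Water Bus Station",
             "direction": "outbound", "serviceType": "Regular"},
            {"name": "Canada Water Bus Station - Tottenham Court Road",
             "direction": "inbound", "serviceType": "Regular"},
        ],
    },
    {
        "id": "100",
        "name": "100",
        "modeName": "bus",
        "routeSections": [
            {"name": "King Edward Street / St Pauls Station - "
                     "St George's Town Hall / Shadwell Stn",
             "direction": "outbound", "serviceType": "Regular"},
        ],
    },
]]

# Flatten the outer nesting so we have a plain list of line dicts.
lines = [line for batch in info2 for line in batch]

# One row per route section; the line-level fields listed in `meta`
# are repeated on every row. The prefix (hypothetical, chosen for this
# sketch) avoids a clash between the two different "name" keys.
routes = pd.json_normalize(
    lines,
    record_path="routeSections",
    meta=["id", "name", "modeName"],
    record_prefix="route_",
)

print(routes)
```

Each row of routes then describes one route section together with the id, name, and modeName of the line it belongs to, so a separate line-level table and manual loop counters are not needed.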