Parsing selected information from reverse geocoding data using Python and the Google API

Date: 2015-08-12 05:02:46

Tags: python json google-maps parsing

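I have been parsing information from JSON data obtained through the Google API. I am trying to extract the address information, and I have managed most of it except for the "location" field: every time I run the script, that part fails with TypeError: string indices must be integers.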

This is the printed output of data = json.loads(requests.get(request).text):

{'results': [{'address_components': [{'long_name': 'De Ruijterkade', 'short_name': 'De Ruijterkade', 'types': ['route']}, {'long_name': 'Burgwallen Nieuwe Zijde', 'short_name': 'Burgwallen Nieuwe Zijde', 'types': ['sublocality_level_2', 'sublocality', 'political']}, {'long_name': 'Centrum', 'short_name': 'Centrum', 'types': ['sublocality_level_1', 'sublocality', 'political']}, {'long_name': 'Amsterdam', 'short_name': 'Amsterdam', 'types': ['locality', 'political']}, {'long_name': 'Amsterdam', 'short_name': 'Amsterdam', 'types': ['administrative_area_level_2', 'political']}, {'long_name': 'Noord-Holland', 'short_name': 'NH', 'types': ['administrative_area_level_1', 'political']}, {'long_name': 'Netherlands', 'short_name': 'NL', 'types': ['country', 'political']}, {'long_name': '1012', 'short_name': '1012', 'types': ['postal_code_prefix', 'postal_code']}], 'formatted_address': 'De Ruijterkade, 1012 Amsterdam, Netherlands', 'geometry': {'location': {'lat': 52.3802983, 'lng': 4.8998297}, 'location_type': 'GEOMETRIC_CENTER', 'bounds': {'northeast': {'lat': 52.3804703, 'lng': 4.9003149}, 'southwest': {'lat': 52.3801263, 'lng': 4.899344399999999}}, 'viewport': {'northeast': {'lat': 52.3816472802915, 'lng': 4.901178630291502}, 'southwest': {'lat': 52.3789493197085, 'lng': 4.898480669708498}}}, 'types': ['route'], 'place_id': 'ChIJK1QpUbYJxkcRjXLTywmZwJI'}, {'address_components': [{'long_name': 'Amsterdam, Veer Centraal Station', 'short_name': 'Amsterdam, Veer Centraal Station', 'types': ['point_of_interest', 'establishment']}, {'long_name': 'Centrum', 'short_name': 'Centrum', 'types': ['sublocality_level_1', 'sublocality', 'political']}, {'long_name': 'Amsterdam', 'short_name': 'Amsterdam', 'types': ['locality', 'political']}, {'long_name': 'Amsterdam', 'short_name': 'Amsterdam', 'types': ['administrative_area_level_2', 'political']}, {'long_name': 'Noord-Holland', 'short_name': 'NH', 'types': ['administrative_area_level_1', 'political']}, {'long_name': 'Netherlands', 
'short_name': 'NL', 'types': ['country', 'political']}, {'long_name': '1011', 'short_name': '1011', 'types': ['postal_code_prefix', 'postal_code']}], 'formatted_address': 'Amsterdam, Veer Centraal Station, 1011 Amsterdam, Netherlands', 'geometry': {'location': {'lat': 52.380707, 'lng': 4.899557}, 'location_type': 'APPROXIMATE', 'viewport': {'northeast': {'lat': 52.3820559802915, 'lng': 4.900905980291502}, 'southwest': {'lat': 52.3793580197085, 'lng': 4.898208019708497}}}, 'types': ['transit_station', 'point_of_interest', 'establishment'], 'place_id': 'ChIJMUfcWrYJxkcREV6IQJl5jEw'}, {'address_components': [{'long_name': 'Burgwallen Nieuwe Zijde', 'short_name': 'Burgwallen Nieuwe Zijde', 'types': ['sublocality_level_2', 'sublocality', 'political']}, {'long_name': 'Centrum', 'short_name': 'Centrum', 'types': ['sublocality_level_1', 'sublocality', 'political']}, {'long_name': 'Amsterdam', 'short_name': 'Amsterdam', 'types': ['locality', 'political']}, {'long_name': 'Government of Amsterdam', 'short_name': 'Government of Amsterdam', 'types': ['administrative_area_level_2', 'political']}, {'long_name': 'North Holland', 'short_name': 'NH', 'types': ['administrative_area_level_1', 'political']}, {'long_name': 'Netherlands', 'short_name': 'NL', 'types': ['country', 'political']}], 'formatted_address': 'Burgwallen Nieuwe Zijde, Amsterdam, Netherlands', 'geometry': {'location': {'lat': 52.37087409999999, 'lng': 4.890298}, 'location_type': 'APPROXIMATE', 'bounds': {'northeast': {'lat': 52.38220399999999, 'lng': 4.906104}, 'southwest': {'lat': 52.366919, 'lng': 4.887786}}, 'viewport': {'northeast': {'lat': 52.38220399999999, 'lng': 4.906104}, 'southwest': {'lat': 52.366919, 'lng': 4.887786}}}, 'types': ['sublocality_level_2', 'sublocality', 'political'], 'place_id': 'ChIJfUZZVbgJxkcR_FWOF0TYX2o'}, {'address_components': [{'long_name': 'Centrum', 'short_name': 'Centrum', 'types': ['sublocality_level_1', 'sublocality', 'political']}, {'long_name': 'Amsterdam', 'short_name': 
'Amsterdam', 'types': ['locality', 'political']}, {'long_name': 'Amsterdam', 'short_name': 'Amsterdam', 'types': ['administrative_area_level_2', 'political']}, {'long_name': 'Noord-Holland', 'short_name': 'NH', 'types': ['administrative_area_level_1', 'political']}, {'long_name': 'Netherlands', 'short_name': 'NL', 'types': ['country', 'political']}], 'formatted_address': 'Centrum, Amsterdam, Netherlands', 'geometry': {'location': {'lat': 52.3717204, 'lng': 4.902072700000001}, 'location_type': 'APPROXIMATE', 'bounds': {'northeast': {'lat': 52.388803, 'lng': 4.9333851}, 'southwest': {'lat': 52.35795, 'lng': 4.874455}}, 'viewport': {'northeast': {'lat': 52.388803, 'lng': 4.9333851}, 'southwest': {'lat': 52.35795, 'lng': 4.874455}}}, 'types': ['sublocality_level_1', 'sublocality', 'political'], 'place_id': 'ChIJ0-khaLsJxkcRm73OQ7Ahm9I'}, {'address_components': [{'long_name': 'Amsterdam', 'short_name': 'Amsterdam', 'types': ['locality', 'political']}, {'long_name': 'Government of Amsterdam', 'short_name': 'Government of Amsterdam', 'types': ['administrative_area_level_2', 'political']}, {'long_name': 'North Holland', 'short_name': 'NH', 'types': ['administrative_area_level_1', 'political']}, {'long_name': 'Netherlands', 'short_name': 'NL', 'types': ['country', 'political']}], 'formatted_address': 'Amsterdam, Netherlands', 'geometry': {'location': {'lat': 52.3702157, 'lng': 4.895167900000001}, 'location_type': 'APPROXIMATE', 'bounds': {'northeast': {'lat': 52.4311573, 'lng': 5.068372600000001}, 'southwest': {'lat': 52.3182742, 'lng': 4.7288558}}, 'viewport': {'northeast': {'lat': 52.4311573, 'lng': 5.068372600000001}, 'southwest': {'lat': 52.3182742, 'lng': 4.7288558}}}, 'types': ['locality', 'political'], 'place_id': 'ChIJVXealLU_xkcRja_At0z9AGY'}, {'address_components': [{'long_name': '1012 PL', 'short_name': '1012 PL', 'types': ['postal_code']}, {'long_name': 'Centrum', 'short_name': 'Centrum', 'types': ['sublocality_level_1', 'sublocality', 'political']}, 
{'long_name': 'Amsterdam', 'short_name': 'Amsterdam', 'types': ['locality', 'political']}, {'long_name': 'Amsterdam', 'short_name': 'Amsterdam', 'types': ['administrative_area_level_2', 'political']}, {'long_name': 'Noord-Holland', 'short_name': 'NH', 'types': ['administrative_area_level_1', 'political']}, {'long_name': 'Netherlands', 'short_name': 'NL', 'types': ['country', 'political']}], 'formatted_address': '1012 PL Amsterdam, Netherlands', 'geometry': {'location': {'lat': 52.3720873, 'lng': 4.8913521}, 'location_type': 'APPROXIMATE', 'bounds': {'northeast': {'lat': 52.3811957, 'lng': 4.9004189}, 'southwest': {'lat': 52.3708059, 'lng': 4.8902981}}, 'viewport': {'northeast': {'lat': 52.3735228, 'lng': 4.892705580291502}, 'southwest': {'lat': 52.3708059, 'lng': 4.890007619708498}}}, 'types': ['postal_code'], 'place_id': 'ChIJ4_rV1MYJxkcRacayFoYUfvQ'}, {'address_components': [{'long_name': '1011 AA', 'short_name': '1011 AA', 'types': ['postal_code']}, {'long_name': 'Amsterdam', 'short_name': 'Amsterdam', 'types': ['locality', 'political']}, {'long_name': 'Government of Amsterdam', 'short_name': 'Government of Amsterdam', 'types': ['administrative_area_level_2', 'political']}, {'long_name': 'North Holland', 'short_name': 'NH', 'types': ['administrative_area_level_1', 'political']}, {'long_name': 'Netherlands', 'short_name': 'NL', 'types': ['country', 'political']}], 'formatted_address': '1011 AA Amsterdam, Netherlands', 'geometry': {'location': {'lat': 52.3811146, 'lng': 4.8987269}, 'location_type': 'APPROXIMATE', 'bounds': {'northeast': {'lat': 52.3826567, 'lng': 4.9034822}, 'southwest': {'lat': 52.3663554, 'lng': 4.8952368}}, 'viewport': {'northeast': {'lat': 52.3826567, 'lng': 4.9034822}, 'southwest': {'lat': 52.3783419, 'lng': 4.8952368}}}, 'types': ['postal_code'], 'place_id': 'ChIJp_CpGLcJxkcRg83v2NdeZlo'}, {'address_components': [{'long_name': '1013 AA', 'short_name': '1013 AA', 'types': ['postal_code']}, {'long_name': 'Amsterdam', 'short_name': 'Amsterdam', 
'types': ['locality', 'political']}, {'long_name': 'North Holland', 'short_name': 'NH', 'types': ['administrative_area_level_1', 'political']}, {'long_name': 'Netherlands', 'short_name': 'NL', 'types': ['country', 'political']}], 'formatted_address': '1013 AA Amsterdam, Netherlands', 'geometry': {'location': {'lat': 52.382054, 'lng': 4.895514700000001}, 'location_type': 'APPROXIMATE', 'bounds': {'northeast': {'lat': 52.3841178, 'lng': 4.9004189}, 'southwest': {'lat': 52.2464631, 'lng': 4.8353082}}, 'viewport': {'northeast': {'lat': 52.3841178, 'lng': 4.9004189}, 'southwest': {'lat': 52.3798028, 'lng': 4.8921732}}}, 'types': ['postal_code'], 'place_id': 'ChIJvZ3ZhckJxkcRyLrNcVreABM'}, {'address_components': [{'long_name': '1012 AB', 'short_name': '1012 AB', 'types': ['postal_code']}, {'long_name': 'Amsterdam', 'short_name': 'Amsterdam', 'types': ['locality', 'political']}, {'long_name': 'Netherlands', 'short_name': 'NL', 'types': ['country', 'political']}], 'formatted_address': '1012 AB Amsterdam, Netherlands', 'geometry': {'location': {'lat': 52.3783909, 'lng': 4.898624799999999}, 'location_type': 'APPROXIMATE', 'bounds': {'northeast': {'lat': 52.38187790000001, 'lng': 5.0284828}, 'southwest': {'lat': 51.5894329, 'lng': 4.765747300000001}}, 'viewport': {'northeast': {'lat': 52.3812639, 'lng': 4.9034822}, 'southwest': {'lat': 52.3770175, 'lng': 4.8950021}}}, 'types': ['postal_code'], 'place_id': 'ChIJ77dAmbcJxkcRF9litSRU4Jk'}, {'address_components': [{'long_name': '1011', 'short_name': '1011', 'types': ['postal_code_prefix', 'postal_code']}, {'long_name': 'Amsterdam', 'short_name': 'Amsterdam', 'types': ['locality', 'political']}, {'long_name': 'Government of Amsterdam', 'short_name': 'Government of Amsterdam', 'types': ['administrative_area_level_2', 'political']}, {'long_name': 'North Holland', 'short_name': 'NH', 'types': ['administrative_area_level_1', 'political']}, {'long_name': 'Netherlands', 'short_name': 'NL', 'types': ['country', 'political']}], 
'formatted_address': '1011 Amsterdam, Netherlands', 'geometry': {'location': {'lat': 52.3729762, 'lng': 4.9039565}, 'location_type': 'APPROXIMATE', 'bounds': {'northeast': {'lat': 52.383664, 'lng': 4.9135433}, 'southwest': {'lat': 52.3657741, 'lng': 4.8932707}}, 'viewport': {'northeast': {'lat': 52.383664, 'lng': 4.9135433}, 'southwest': {'lat': 52.3657741, 'lng': 4.8932707}}}, 'types': ['postal_code_prefix', 'postal_code'], 'place_id': 'ChIJRYIbLroJxkcRp-AVrGq2xcw'}, {'address_components': [{'long_name': 'Government of Amsterdam', 'short_name': 'Government of Amsterdam', 'types': ['administrative_area_level_2', 'political']}, {'long_name': 'North Holland', 'short_name': 'NH', 'types': ['administrative_area_level_1', 'political']}, {'long_name': 'Netherlands', 'short_name': 'NL', 'types': ['country', 'political']}], 'formatted_address': 'Government of Amsterdam, Netherlands', 'geometry': {'location': {'lat': 52.3666969, 'lng': 4.8945398}, 'location_type': 'APPROXIMATE', 'bounds': {'northeast': {'lat': 52.4311573, 'lng': 5.068372600000001}, 'southwest': {'lat': 52.27783549999999, 'lng': 4.7287818}}, 'viewport': {'northeast': {'lat': 52.4311573, 'lng': 5.068372600000001}, 'southwest': {'lat': 52.27783549999999, 'lng': 4.7287818}}}, 'types': ['administrative_area_level_2', 'political'], 'place_id': 'ChIJVXealLU_xkcRRVd1SMEgTw4'}, {'address_components': [{'long_name': 'North Holland', 'short_name': 'NH', 'types': ['administrative_area_level_1', 'political']}, {'long_name': 'Netherlands', 'short_name': 'NL', 'types': ['country', 'political']}], 'formatted_address': 'North Holland, Netherlands', 'geometry': {'location': {'lat': 52.5205869, 'lng': 4.788474}, 'location_type': 'APPROXIMATE', 'bounds': {'northeast': {'lat': 53.1833322, 'lng': 5.328279999999999}, 'southwest': {'lat': 52.1657716, 'lng': 4.4937415}}, 'viewport': {'northeast': {'lat': 53.1833322, 'lng': 5.328279999999999}, 'southwest': {'lat': 52.1657716, 'lng': 4.4937415}}}, 'types': 
['administrative_area_level_1', 'political'], 'place_id': 'ChIJu-SH28MJxkcRJYI2wf63IME'}, {'address_components': [{'long_name': 'Netherlands', 'short_name': 'NL', 'types': ['country', 'political']}], 'formatted_address': 'Netherlands', 'geometry': {'location': {'lat': 52.132633, 'lng': 5.291265999999999}, 'location_type': 'APPROXIMATE', 'bounds': {'northeast': {'lat': 53.5551999, 'lng': 7.227510199999999}, 'southwest': {'lat': 50.75038379999999, 'lng': 3.357962}}, 'viewport': {'northeast': {'lat': 53.6756, 'lng': 7.227140500000001}, 'southwest': {'lat': 50.7503837, 'lng': 3.3316}}}, 'types': ['country', 'political'], 'place_id': 'ChIJu-SH28MJxkcRnwq9_851obM'}], 'status': 'OK'}

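I tried printing the component for that field; it prints "location" and the error follows. I do not know what to do or how to continue parsing the information I need. Can anyone provide an answer? Thanks!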

Here is my script. The part that raises the error is the location section near the bottom.

import pandas as pd 
import requests 
import geocoder

# standard-library modules; these do not need to be installed
import time 
import json 
df = pd.read_csv('/Users/albertgonzalobautista/Desktop/workingbook.csv') # CSV file to read and geocode
# create new columns for the output CSV 
df['geocode_data'] = ''
df['address']=''
df['street_number']=''
df['street_name']=''
df['postalcode']=''
df['city']=''
df['st_pr_mn']=''
df['country']=''
df['location_lat']=''
df['location_lon']=''

# Create function that handles the geocoding requests 

average = 0

def reverseGeocode(latlng): #defines reverse geocoding function
    #Set parameters
    start = time.time()
    result = {} #create empty dict
    url = 'https://maps.googleapis.com/maps/api/geocode/json?latlng={0}&key={1}' #Access URL for Google Geocoder API
    apikey = 'AXXX' # Set your API key taken from the Google API website and your Google Developers account
    request = url.format(latlng, apikey)
    #delays responses so that it does not over
    data = json.loads(requests.get(request).text)
    if len(data['results']) > 0:
        result = data['results'][0]
    #global average #if not work delete first char(uncomment)
    average = time.time() - start

    return result

for i, row in df.iterrows():
    if average < 0.3 : time.sleep(0.3 - average) #0.3 is period time (min= 0.2 max = free)

    df['geocode_data'][i] = reverseGeocode(df['lat'][i].astype(str) + ',' + df['lon'][i].astype(str))


for i, row in df.iterrows():
    if 'address_components' in row['geocode_data']:
        for component in row['geocode_data']['address_components']:
            df['address'][i] = row['geocode_data']['formatted_address']

        for component in row['geocode_data']['address_components']:
            if 'street_number' in component['types']:
                df['street_number'][i] = component['long_name']

        for component in row['geocode_data']['address_components']:
            if 'route' in component ['types']:
                df['street_name'][i] = component['long_name']
                break
        for component in row['geocode_data']['address_components']:
            if 'route' in component ['types']:
                df['street_name'][i] = component['long_name']

        for component in row['geocode_data']['address_components']:
            if 'postal_code' in component ['types']:
                df['postalcode'][i] = component['short_name']
                break

        for component in row['geocode_data']['address_components']:
            if 'locality' in component ['types']:
                df['city'][i]= component['short_name']
                break

        for component in row['geocode_data']['address_components']:
            if 'administrative_area_level_1' in component ['types']:
                df['st_pr_mn'][i] = component ['long_name']
                break

        for component in row['geocode_data']['address_components']:
            if 'country' in component ['types']:
                df['country'][i] = component ['long_name']
                break

        for component in row['geocode_data']['geometry']:
            if component['location']:
                df['location_lng'][i] = component['location']['lng']
                df['location_lat'][i] = component['location']['lat']

df.to_csv('test10.csv', encoding='utf-8', index=False)

Here is an example of the data I am trying to get:

  

{'geometry': {'viewport': {'southwest': {'lng': 4.947849719708499, 'lat': 52.36571761970851}, 'northeast': {'lng': 4.950547680291502, 'lat': 52.3684155802915}}, 'location': {'lng': 4.9491987, 'lat': 52.3670666}, 'location_type': 'ROOFTOP'}, 'address_components': [{'long_name': '114', 'types': ['street_number'], 'short_name': '114'}, {'long_name': 'Zeeburgerpad', 'types': ['route'], 'short_name': 'Zeeburgerpad'}, {'long_name': 'Amsterdam-Oost', 'types': ['sublocality_level_1', 'sublocality', 'political'], 'short_name': 'Amsterdam-Oost'}, {'long_name': 'Amsterdam', 'types': ['locality', 'political'], 'short_name': 'Amsterdam'}, {'long_name': 'Amsterdam', 'types': ['administrative_area_level_2', 'political'], 'short_name': 'Amsterdam'}, {'long_name': 'Noord-Holland', 'types': ['administrative_area_level_1', 'political'], 'short_name': 'NH'}, {'long_name': 'Netherlands', 'types': ['country', 'political'], 'short_name': 'NL'}, {'long_name': '1019 AE', 'types': ['postal_code'], 'short_name': '1019 AE'}], 'place_id': 'chIJD14pyz8JxkcRF1Kpg8opql4', 'formatted_address': 'Zeeburgerpad 114, 1019 AE Amsterdam, Netherlands', 'types': ['street_address']}

1 Answer:

Answer 0 (score: 1)

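The problem is that row['geocode_data']['geometry'] returns a dictionary that contains your location along with other fields. So when you iterate over that dictionary like this: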

for component in row['geocode_data']['geometry']:
    if component['location']:
        df['location_lng'][i] = component['location']['lng']
        df['location_lat'][i] = component['location']['lat']

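component is a key of the dictionary (not the dictionary itself), and that key is a string, so evaluating component['location'] raises the error. A very simple example that shows the same error, using a dictionary d whose only key is the string '1':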

>>> for i in d:
...     i['1']
...
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
TypeError: string indices must be integers
>>> for i in d:
...     print(i)
...
1
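
The same behaviour can be reproduced in a small standalone script. This is only a minimal sketch: the geometry dictionary below is a hypothetical stand-in trimmed from the response in the question, not the asker's actual data frame row.

```python
# Hypothetical stand-in for row['geocode_data']['geometry'],
# trimmed from the first result in the question's response.
geometry = {
    'location': {'lat': 52.3802983, 'lng': 4.8998297},
    'location_type': 'GEOMETRIC_CENTER',
}

# Iterating over a dict yields its keys, which here are plain strings.
keys = [component for component in geometry]
print(keys)

# Indexing one of those string keys with another string raises TypeError.
raised = False
for component in geometry:
    try:
        component['location']
    except TypeError as exc:
        raised = True
        print(repr(component), '->', exc)
        break

# Indexing the dictionary itself works as intended.
location = geometry['location']
print(location['lat'], location['lng'])
```

The loop variable is the string 'location' on the first pass, which is why the lookup fails before any coordinates are read.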

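In your case you do not need to loop over the dictionary at all. Just assign the dictionary to component and, if it contains 'location', read its values. Example: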

component = row['geocode_data']['geometry']
if 'location' in component:
    df['location_lon'][i] = component['location']['lng']  # the column created in the question's script is 'location_lon'
    df['location_lat'][i] = component['location']['lat']
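
If a result might be missing 'geometry' or 'location', the lookup could also be wrapped in a small helper. What follows is a minimal sketch under that assumption; the function name extract_location and the sample result dictionary are hypothetical and introduced only for illustration.

```python
def extract_location(result):
    """Return (lat, lng) from one geocoding result dict, or (None, None) if absent."""
    geometry = result.get('geometry', {})
    location = geometry.get('location')
    if not location:
        return None, None
    return location.get('lat'), location.get('lng')


# Hypothetical sample shaped like one entry of data['results'].
result = {
    'formatted_address': 'De Ruijterkade, 1012 Amsterdam, Netherlands',
    'geometry': {
        'location': {'lat': 52.3802983, 'lng': 4.8998297},
        'location_type': 'GEOMETRIC_CENTER',
    },
}

lat, lng = extract_location(result)
print(lat, lng)

# An empty dict (what reverseGeocode returns when there are no results) is handled too.
print(extract_location({}))
```

Using dict.get keeps rows without coordinates from raising KeyError, at the cost of having to check for None afterwards.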