TypeError: string indices must be integers (Python)

Time: 2015-08-11 22:40:24

Tags: python google-maps for-loop

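I'm trying to create a .csv file of parsed geocoding data from the Google Geocoding API. I want to parse the address information into separate columns. My script runs fine until it reaches the location data, where I get a TypeError. Can anyone help me and modify my script so that I can include this data in my table?

Here is my script: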

import pandas as pd 
import requests 
import geocoder
import time 
import json 
df = pd.read_csv('/Users/albertgonzalobautista/Desktop/workingbook.csv') # CSV file to be geocoded

# create new columns for the output CSV 
df['geocode_data'] = ''
df['address']=''
df['street_number']=''
df['street_name']=''
df['postalcode']=''
df['city']=''
df['st_pr_mn']=''
df['country']=''
df['location_lat']=''
df['location_lon']=''

# Create function that handles the geocoding requests 

average = 0
def reverseGeocode(latlng): #defines reverse geocoding function 
    #Set parameters
    start = time.time()
    result = {} # create empty dict
    url = 'https://maps.googleapis.com/maps/api/geocode/json?latlng={0}&key={1}' #Access URL for Google Geocoder API 
    apikey = 'XXX' # Set your API key, taken from the Google API website and your Google Developers account
    request = url.format(latlng, apikey) 
    #delays responses so that it does not over     
    data = json.loads(requests.get(request).text)
    if len(data['results']) > 0:
        result = data['results'][0]
    #global average #if not work delete first char(uncomment)
    average = time.time() - start

    return  result

for i, row in df.iterrows():
    if average < 0.3 : time.sleep(0.3 - average) #0.3 is period time (min= 0.2 max = free)

    df['geocode_data'][i] = reverseGeocode(df['lat'][i].astype(str) + ',' + df['lon'][i].astype(str))


for i, row in df.iterrows():
    if 'address_components' in row['geocode_data']:
        for component in row['geocode_data']['address_components']:
            df['address'][i] = row['geocode_data']['formatted_address']

        for component in row['geocode_data']['address_components']:
            if 'street_number' in component['types']:
                df['street_number'][i] = component['long_name']

        for component in row['geocode_data']['address_components']:
            if 'route' in component['types']:
                df['street_name'][i] = component['long_name']
                break
        for component in row['geocode_data']['address_components']:
            if 'route' in component['types']:
                df['street_name'][i] = component['long_name']

        for component in row['geocode_data']['address_components']:
            if 'postal_code' in component['types']:
                df['postalcode'][i] = component['short_name']
                break

        for component in row['geocode_data']['address_components']:
            if 'locality' in component['types']:
                df['city'][i] = component['short_name']
                break

        for component in row['geocode_data']['address_components']:
            if 'administrative_area_level_1' in component['types']:
                df['st_pr_mn'][i] = component['long_name']
                break

        for component in row['geocode_data']['address_components']:
            if 'country' in component['types']:
                df['country'][i] = component['long_name']
                break

        for component in row['geocode_data']['geometry']:
            if component['location']:
                df['location_lng'][i] = int(component['location']['lng'])
                df['location_lat'][i] = int(component['location']['lat'])

df.to_csv('test10.csv', encoding='utf-8', index=False)

Here is a sample of the Google data I get in geocode_data:

{'geometry': {'viewport': {'southwest': {'lng': 4.947849719708499, 'lat': 52.36571761970851}, 'northeast': {'lng': 4.950547680291502, 'lat': 52.3684155802915}}, 'location': {'lng': 4.9491987, 'lat': 52.3670666}, 'location_type': 'ROOFTOP'}, 'address_components': [{'long_name': '114', 'types': ['street_number'], 'short_name': '114'}, {'long_name': 'Zeeburgerpad', 'types': ['route'], 'short_name': 'Zeeburgerpad'}, {'long_name': 'Amsterdam-Oost', 'types': ['sublocality_level_1', 'sublocality', 'political'], 'short_name': 'Amsterdam-Oost'}, {'long_name': 'Amsterdam', 'types': ['locality', 'political'], 'short_name': 'Amsterdam'}, {'long_name': 'Amsterdam', 'types': ['administrative_area_level_2', 'political'], 'short_name': 'Amsterdam'}, {'long_name': 'Noord-Holland', 'types': ['administrative_area_level_1', 'political'], 'short_name': 'NH'}, {'long_name': 'Netherlands', 'types': ['country', 'political'], 'short_name': 'NL'}, {'long_name': '1019 AE', 'types': ['postal_code'], 'short_name': '1019 AE'}], 'place_id': 'ChIJD14pyz8JxkcRF1Kpg8opql4', 'formatted_address': 'Zeeburgerpad 114, 1019 AE Amsterdam, Netherlands', 'types': ['street_address']}

2 answers:

Answer 0 (score: 0)

I spotted a possible typo in the part of your code that reads the location. I believe you want to look up component['lng'] there.

If that doesn't solve the problem, could you print out df['location_lng'] so we can see exactly what type it is?

Answer 1 (score: 0)

Your path to 'location' looks incorrect. Try this:

for component in row['geocode_data']['geometry']:
    if component['location']: # Note: NOT: if 'location' in component ['...']
        df['location_lng'][i] = component['location']['lng'] 
        df['location_lat'][i] = component['location']['lat'] 

Update: It looks like the actual problem is how the output 'array' is being used. You set up the output column like this:

df['location_lat']=''

Then you assign:

df['location_lat'][i]=...

which is effectively:

''[i] = ...

and that would cause the TypeError noted. [I get "KeyError: 'location_lat'", but hey...]

Perhaps try something like:

results = []
...
for ...
    df = {}
    ...
    df['location_lng'] = component['location']['lng'] 
    ...
    results.append(df)
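
A rough sketch of that idea using only the standard library, assuming each geocoding result has already been fetched. The `results_from_google` list and the two column names are placeholders of mine, not part of the question's script:

```python
import csv
import io

# Placeholder input (assumption): one trimmed geocoding result per input row.
results_from_google = [
    {'geometry': {'location': {'lng': 4.9491987, 'lat': 52.3670666}}},
    {},  # an empty dict stands for "no result returned"
]

rows = []
for result in results_from_google:
    # Start every row with empty cells so missing results still produce a line.
    row = {'location_lat': '', 'location_lng': ''}
    location = result.get('geometry', {}).get('location')
    if location:
        row['location_lat'] = location['lat']
        row['location_lng'] = location['lng']
    rows.append(row)

# Written to an in-memory buffer here; a real file opened with newline=''
# could be passed to DictWriter instead.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=['location_lat', 'location_lng'])
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

Building plain dicts first keeps the extraction logic separate from whatever ends up writing the table.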

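Update to the update: Not having used Pandas, I'm not sure how df maps onto the CSV format; however, I'd speculate that if you remove the "create new columns for the output CSV" section, your DataFrame structure won't be broken. That is, remove: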

df['geocode_data'] = ''
df['address']=''

I may be barking up the wrong tree.
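
For what it's worth, a quick check along these lines suggests the pre-created columns are probably not the culprit. This is only a sketch on a one-row stand-in DataFrame of my own; it also initialises the column with NaN rather than '' as the question does. As far as I can tell, assigning a scalar to a new column fills every row, and a KeyError appears when the column name written to ('location_lng') differs from the one created ('location_lon'):

```python
import pandas as pd

# Tiny stand-in for the question's input CSV (values are illustrative).
df = pd.DataFrame({'lat': [52.3670666], 'lon': [4.9491987]})

# Assigning a scalar to a new column fills every row with that value;
# the column is still a Series, not a bare string.
df['address'] = ''
print(type(df['address']).__name__, len(df['address']))

# The question creates 'location_lon' but later writes to 'location_lng',
# so looking up the second name fails with a KeyError.
df['location_lon'] = float('nan')
try:
    df['location_lng']
    missing_key = None
except KeyError as exc:
    missing_key = exc.args[0]
print(missing_key)

# Writing one cell by row label and column name avoids chained indexing.
df.at[0, 'location_lon'] = 4.9491987
print(df.at[0, 'location_lon'])
```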

Another update: My simple test case is:

js = ... (json string as above)
data = json.loads(js)

component = data['geometry']
if component['location']:
    val = component['location']['lat']
    print(val)

This works for extracting the latitude, so the 'if' part shouldn't be the problem.

Update (again...): OK, one more try. Instead of:
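
A minimal sketch of the difference, using a trimmed copy of the sample result from the question (the variable names are mine). Indexing the 'geometry' dict directly works, whereas a for loop over that dict yields its keys, which are plain strings, and indexing a string with 'location' raises the TypeError from the title:

```python
# Trimmed copy of the sample geocoding result from the question.
result = {
    'geometry': {
        'location': {'lng': 4.9491987, 'lat': 52.3670666},
        'location_type': 'ROOFTOP',
    },
}

# Direct access: 'geometry' is a dict, so chained key lookups work.
lat = result['geometry']['location']['lat']
lng = result['geometry']['location']['lng']

# Looping over a dict yields its keys (strings), not nested dicts.
keys_seen = []
error_message = None
for component in result['geometry']:
    keys_seen.append(component)
    try:
        component['location']  # a str indexed with a str
    except TypeError as exc:
        error_message = str(exc)

print(lat, lng)
print(keys_seen)
print(error_message)
```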

df['geocode_data'][i] = reverseGeocode(...)

assign directly to a dictionary variable for the data extraction, i.e.:

data = reverseGeocode(...)

Then extract the data as above and assign it to your DataFrame as needed.
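
One way that extraction might look, as a sketch only: `first_component` and `row` are hypothetical names of mine, and `data` is a trimmed copy of the sample result from the question standing in for the return value of reverseGeocode(...).

```python
def first_component(result, wanted_type, name_key='long_name'):
    """Return the name of the first address component tagged wanted_type, or ''."""
    for component in result.get('address_components', []):
        if wanted_type in component['types']:
            return component[name_key]
    return ''


# Trimmed copy of the sample result shown in the question.
data = {
    'geometry': {'location': {'lng': 4.9491987, 'lat': 52.3670666}},
    'address_components': [
        {'long_name': '114', 'types': ['street_number'], 'short_name': '114'},
        {'long_name': 'Zeeburgerpad', 'types': ['route'], 'short_name': 'Zeeburgerpad'},
        {'long_name': 'Amsterdam', 'types': ['locality', 'political'], 'short_name': 'Amsterdam'},
        {'long_name': 'Noord-Holland', 'types': ['administrative_area_level_1', 'political'], 'short_name': 'NH'},
        {'long_name': 'Netherlands', 'types': ['country', 'political'], 'short_name': 'NL'},
        {'long_name': '1019 AE', 'types': ['postal_code'], 'short_name': '1019 AE'},
    ],
    'formatted_address': 'Zeeburgerpad 114, 1019 AE Amsterdam, Netherlands',
}

# 'geometry' is indexed directly rather than looped over.
location = data['geometry']['location']
row = {
    'address': data.get('formatted_address', ''),
    'street_number': first_component(data, 'street_number'),
    'street_name': first_component(data, 'route'),
    'postalcode': first_component(data, 'postal_code', 'short_name'),
    'city': first_component(data, 'locality', 'short_name'),
    'st_pr_mn': first_component(data, 'administrative_area_level_1'),
    'country': first_component(data, 'country'),
    'location_lat': location['lat'],
    'location_lng': location['lng'],
}
print(row)
```

The single helper replaces the repeated per-field loops in the question, and coordinates are kept as floats because int() would truncate them to whole degrees.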