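I am trying to create a .csv file of parsed geocode data from the Google Geocoding API. I want to split the address information into separate columns. My script runs fine until it reaches the location data, where I get a TypeError. Can anyone help me modify my script so that I can include this data in my table?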
import pandas as pd
import requests
import geocoder
import time
import json
df = pd.read_csv('/Users/albertgonzalobautista/Desktop/workingbook.csv') # CSV file to be read and geocoded
# create new columns for the output CSV
df['geocode_data'] = ''
df['address']=''
df['street_number']=''
df['street_name']=''
df['postalcode']=''
df['city']=''
df['st_pr_mn']=''
df['country']=''
df['location_lat']=''
df['location_lon']=''
# Create function that handles the geocoding requests
average = 0
def reverseGeocode(latlng): # defines reverse geocoding function
    # Set parameters
    start = time.time()
    result = {} # create empty dict
    url = 'https://maps.googleapis.com/maps/api/geocode/json?latlng={0}&key={1}' # Access URL for Google Geocoder API
    apikey = 'XXX' # Set your API key, taken from the Google API website and your Google Developers account
    request = url.format(latlng, apikey)
    #delays responses so that it does not over
    data = json.loads(requests.get(request).text)
    if len(data['results']) > 0:
        result = data['results'][0]
    #global average #if not work delete first char(uncomment)
    average = time.time() - start
    return result
for i, row in df.iterrows():
    if average < 0.3 : time.sleep(0.3 - average) # 0.3 is period time (min= 0.2 max = free)
    df['geocode_data'][i] = reverseGeocode(df['lat'][i].astype(str) + ',' + df['lon'][i].astype(str))
for i, row in df.iterrows():
    if 'address_components' in row['geocode_data']:
        for component in row['geocode_data']['address_components']:
            df['address'][i] = row['geocode_data']['formatted_address']
        for component in row['geocode_data']['address_components']:
            if 'street_number' in component['types']:
                df['street_number'][i] = component['long_name']
        for component in row['geocode_data']['address_components']:
            if 'route' in component['types']:
                df['street_name'][i] = component['long_name']
                break
        for component in row['geocode_data']['address_components']:
            if 'route' in component['types']:
                df['street_name'][i] = component['long_name']
        for component in row['geocode_data']['address_components']:
            if 'postal_code' in component['types']:
                df['postalcode'][i] = component['short_name']
                break
        for component in row['geocode_data']['address_components']:
            if 'locality' in component['types']:
                df['city'][i] = component['short_name']
                break
        for component in row['geocode_data']['address_components']:
            if 'administrative_area_level_1' in component['types']:
                df['st_pr_mn'][i] = component['long_name']
                break
        for component in row['geocode_data']['address_components']:
            if 'country' in component['types']:
                df['country'][i] = component['long_name']
                break
        for component in row['geocode_data']['geometry']:
            if component['location']:
                df['location_lng'][i] = int(component['location']['lng'])
                df['location_lat'][i] = int(component['location']['lat'])
df.to_csv('test10.csv', encoding='utf-8', index=False)
{'geometry': {'viewport': {'southwest': {'lng': 4.947849719708499, 'lat': 52.36571761970851},
                           'northeast': {'lng': 4.950547680291502, 'lat': 52.3684155802915}},
              'location': {'lng': 4.9491987, 'lat': 52.3670666},
              'location_type': 'ROOFTOP'},
 'address_components': [
    {'long_name': '114', 'types': ['street_number'], 'short_name': '114'},
    {'long_name': 'Zeeburgerpad', 'types': ['route'], 'short_name': 'Zeeburgerpad'},
    {'long_name': 'Amsterdam-Oost', 'types': ['sublocality_level_1', 'sublocality', 'political'], 'short_name': 'Amsterdam-Oost'},
    {'long_name': 'Amsterdam', 'types': ['locality', 'political'], 'short_name': 'Amsterdam'},
    {'long_name': 'Amsterdam', 'types': ['administrative_area_level_2', 'political'], 'short_name': 'Amsterdam'},
    {'long_name': 'Noord-Holland', 'types': ['administrative_area_level_1', 'political'], 'short_name': 'NH'},
    {'long_name': 'Netherlands', 'types': ['country', 'political'], 'short_name': 'NL'},
    {'long_name': '1019 AE', 'types': ['postal_code'], 'short_name': '1019 AE'}],
 'place_id': 'ChIJD14pyz8JxkcRF1Kpg8opql4',
 'formatted_address': 'Zeeburgerpad 114, 1019 AE Amsterdam, Netherlands',
 'types': ['street_address']}
Answer 0 (score: 0)
I spotted a possible typo in your code:
I believe you want to look up component['lng'] there.
If that doesn't solve the problem, could you print df['location_lng'] so we can see exactly what type it is?
Answer 1 (score: 0)
Your 'location' path looks incorrect. Try this:
for component in row['geocode_data']['geometry']:
    if component['location']: # Note: NOT: if 'location' in component ['...']
        df['location_lng'][i] = component['location']['lng']
        df['location_lat'][i] = component['location']['lat']
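Update: It looks like the actual problem is how the output 'array' is being used. You initialize the output with:
df['location_lat']=''
and then you assign:
df['location_lat'][i]=...
which is effectively:
''[i] = ...
That would produce the TypeError you mention. [I get "KeyError: 'location_lat'", but hey...]
Maybe try something like: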
results = []
...
for ...
    df = {}
    ...
    df['location_lng'] = component['location']['lng']
    ...
    results.append(df)
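The list-of-dicts idea above could be sketched roughly as follows. This is only an illustration under some assumptions: `geocoded` is a hypothetical stand-in for the per-row geocoder results, the standard-library `csv` module is used instead of Pandas, and the output goes to an in-memory buffer rather than a real file.

```python
import csv
import io

# Hypothetical stand-ins for the per-row geocoder results.
geocoded = [
    {'geometry': {'location': {'lng': 4.9491987, 'lat': 52.3670666}}},
    {},  # reverseGeocode returns an empty dict when there are no results
]

results = []
for data in geocoded:
    # One plain dict per input row; blank values when nothing was found.
    record = {'location_lat': '', 'location_lng': ''}
    location = data.get('geometry', {}).get('location')
    if location:
        record['location_lat'] = location['lat']
        record['location_lng'] = location['lng']
    results.append(record)

# Write the collected records as CSV (StringIO stands in for an output file).
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=['location_lat', 'location_lng'])
writer.writeheader()
writer.writerows(results)
csv_text = buffer.getvalue()
print(csv_text)
```

A list of dicts like `results` can also be passed straight to `pandas.DataFrame(results)` if you would rather stay in Pandas.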
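Update to the update: I haven't used Pandas, so I'm not sure how df maps to the CSV format. My guess, though, is that if you remove the "create new columns for the output CSV" section, your DataFrame structure won't be clobbered. That is, remove:
df['geocode_data'] = ''
df['address']=''
etc.
I may be barking up the wrong tree.
Another update: My simple test case is: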
js = ... (json string as above)
data = json.loads(js)
component = data['geometry']
if component['location']:
    val = component['location']['lat']
    print(val)
This works for extracting the latitude, so the problem shouldn't be the 'if' part.
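For comparison, here is a small sketch of what is probably happening in the question's loop. It assumes the 'geometry' value is a plain dict, as in the sample output; the JSON string is a trimmed-down, hand-written version of that sample. Iterating over a dict yields its keys, which are strings, and indexing a string with 'location' raises a TypeError, whereas indexing the dict directly works.

```python
import json

# Trimmed-down sample with the same shape as the 'geometry' part of the result.
js = '{"geometry": {"location": {"lng": 4.9491987, "lat": 52.3670666}, "location_type": "ROOFTOP"}}'
data = json.loads(js)
geometry = data['geometry']

# Iterating over a dict yields its keys, which are plain strings.
keys = list(geometry)

# Indexing one of those strings with another string raises TypeError.
error = None
try:
    for component in geometry:
        component['location']
except TypeError as exc:
    error = exc

# Indexing the dict directly, without a loop, works.
lat = geometry['location']['lat']
lng = geometry['location']['lng']
print(keys, type(error).__name__, lat, lng)
```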
Update (again...): OK, one more try. Instead of:
df['geocode_data'][i] = reverseGeocode(...)
assign directly to a dictionary variable for the data extraction, i.e.:
data = reverseGeocode(...)
Then extract the data as shown above and assign it to your DataFrame as needed.
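As a rough sketch of that last step, the extraction could be pulled into a pair of small functions that work on the plain dict. The helper names `first_component` and `parse_result` are made up for illustration, the sample `data` is a trimmed copy of the result shown in the question, and the column names follow the question's code (with 'location_lng' assumed for the longitude column).

```python
def first_component(data, wanted_type, field='long_name'):
    """Return `field` of the first address component tagged `wanted_type`, else ''."""
    for component in data.get('address_components', []):
        if wanted_type in component['types']:
            return component[field]
    return ''


def parse_result(data):
    """Flatten one geocoder result dict into the columns used in the question."""
    location = data.get('geometry', {}).get('location', {})
    return {
        'address': data.get('formatted_address', ''),
        'street_number': first_component(data, 'street_number'),
        'street_name': first_component(data, 'route'),
        'postalcode': first_component(data, 'postal_code', 'short_name'),
        'city': first_component(data, 'locality', 'short_name'),
        'st_pr_mn': first_component(data, 'administrative_area_level_1'),
        'country': first_component(data, 'country'),
        'location_lat': location.get('lat', ''),
        'location_lng': location.get('lng', ''),
    }


# Trimmed copy of the sample result from the question.
data = {
    'geometry': {'location': {'lng': 4.9491987, 'lat': 52.3670666}},
    'address_components': [
        {'long_name': '114', 'types': ['street_number'], 'short_name': '114'},
        {'long_name': 'Zeeburgerpad', 'types': ['route'], 'short_name': 'Zeeburgerpad'},
        {'long_name': 'Amsterdam', 'types': ['locality', 'political'], 'short_name': 'Amsterdam'},
        {'long_name': '1019 AE', 'types': ['postal_code'], 'short_name': '1019 AE'},
    ],
    'formatted_address': 'Zeeburgerpad 114, 1019 AE Amsterdam, Netherlands',
}
row = parse_result(data)
print(row)
```

In the question's loop, each value could then be written back with something like `df.at[i, column] = value` for every `column, value` pair in the returned dict.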