Geocoding with Pandas - applying different geocoding in lambda x depending on a condition

Time: 2018-04-25 15:31:36

Tags: python pandas lambda geocoding geopy

I am trying to use geocoding to extract postal codes. I have two different inputs to geocode:

a. First, try the street address together with the city and country.
b. If that returns nothing, retry with just the city and country.
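
The two-step lookup in (a) and (b) can be sketched as a small helper that tries each query in order and stops at the first hit. This is only an illustration: `first_hit` and `fake_geocode` are hypothetical names, the sample address is made up, and `fake_geocode` stands in for a real geocoder call such as `geolocator.geocode` so that the snippet runs without network access.

```python
def first_hit(queries, geocode):
    """Return the first non-None result of geocode(query), or None if all fail."""
    for query in queries:
        result = geocode(query)
        if result is not None:
            return result
    return None


# Hypothetical stand-in for a geocoder: it only "knows" one city-level query.
# Results mimic an (address, (latitude, longitude)) pair; the values are made up.
KNOWN = {"Italy  Milano": ("Milano, Lombardia, 20121, Italia", (45.46, 9.19))}


def fake_geocode(query):
    return KNOWN.get(query)


# Most specific query first, then the city-and-country fallback.
queries = ["Italy  Milano  Via Inesistente 1", "Italy  Milano"]
hit = first_hit(queries, fake_geocode)
print(hit[0])
```

With a real geocoder, a helper like this only issues the second request when the first one returns nothing.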

My program is shown below. It works fine if I use only the first approach, but it fails when I try to combine the two:

from geopy.geocoders import Nominatim
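# user_agent is a placeholder name; recent geopy versions require a custom one for Nominatim
geolocator = Nominatim(user_agent="zip_lookup_example", timeout=1000)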
import io
import sys
# import urllib.request
sys.stdout = io.TextIOWrapper(sys.stdout.buffer,encoding='utf8') # change the default encoding of standard output
import pandas as pd

df = pd.read_csv("Italy.csv", encoding="cp1252") # read the CSV file as cp1252-encoded text
# print(df)

df["Conca_1"]=df["country2"]+" " " "+df["City"]+" " " "+df["Address Line 1"]
df["Conca_2"]=df["country2"]+" " " "+df["City"]
# print(df["Conca2"])
df["Coordinates_1"]=df["Conca_1"].apply(geolocator.geocode)
df["Coordinates_2"]=df["Conca_2"].apply(geolocator.geocode)
# df["zip"] = df['Coordinates_1'].apply(lambda x: x[0].split(',')[-2] if x != None else None)
df["zip"] = df['Coordinates_1'].apply(lambda x: x[0].split(',')[-2] if x != None else df['Coordinates_2'].apply(x[0].split(',')[-2] if x != None else None ))
print(df["zip"])

# Write the dataframe to an Excel file; the with-block saves and closes the workbook
# (writer.save() is no longer available in recent pandas versions).
with pd.ExcelWriter('coordinates_result.xlsx', engine='xlsxwriter') as writer:
    df.to_excel(writer, sheet_name='Sheet1')
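
A likely reason the combined `zip` line fails: in the `else` branch `x` is None, so the inner conditional also evaluates to None and `df['Coordinates_2'].apply(...)` is handed None instead of a function. Even with a proper function, that inner call would return a whole Series rather than one value for the current row. A minimal sketch of combining the two result columns element by element is shown here. The helper name `zip_from` and the sample addresses are made up for illustration, and plain tuples stand in for geocoder results so that no network access is needed.

```python
import pandas as pd


def zip_from(location):
    """Second-to-last comma-separated field of the address, or None."""
    if location is None:
        return None
    # .strip() removes the space left after the comma (not in the original code)
    return location[0].split(',')[-2].strip()


# Made-up stand-ins for geocoder results: (address, (lat, lon)) or None.
df = pd.DataFrame({
    "Coordinates_1": [("Via Roma, Torino, 10121, Italia", (45.07, 7.68)), None, None],
    "Coordinates_2": [("Torino, 10100, Italia", (45.07, 7.69)),
                      ("Milano, 20121, Italia", (45.46, 9.19)),
                      None],
})

zip_1 = df["Coordinates_1"].map(zip_from)
zip_2 = df["Coordinates_2"].map(zip_from)

# Keep the first result where it exists, otherwise take the second.
df["zip"] = zip_1.where(zip_1.notna(), zip_2)
print(df["zip"].tolist())
```

Extracting each column separately keeps the None check in one place and avoids nesting one `apply` inside another.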

1 answer:

Answer 0 (score: 0)

Solved it myself.

from geopy.geocoders import Nominatim
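# user_agent is a placeholder name; recent geopy versions require a custom one for Nominatim
geolocator = Nominatim(user_agent="zip_lookup_example", timeout=1000)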
import io
import sys
# import urllib.request
sys.stdout = io.TextIOWrapper(sys.stdout.buffer,encoding='utf8') # change the default encoding of standard output
import pandas as pd

df = pd.read_csv("1.csv", encoding="cp1252") # read the CSV file as cp1252-encoded text
# print(df)

df["Conca_1"]=df["City"]
df["Conca_2"]=df["country2"]+" " " "+df["City"]
# print(df["Conca2"])
df["Coordinates_1"]=df["Conca_1"].apply(geolocator.geocode)
df["Coordinates_2"]=df["Conca_2"].apply(geolocator.geocode)
# df["zip"] = df['Coordinates_1'].apply(lambda x: x[0].split(',')[-2] if x != None else None)

# Use the first lookup if it succeeded, otherwise fall back to the second; None if both failed.
def pick_zip(row):
    for col in ("Coordinates_1", "Coordinates_2"):
        if row[col] is not None:
            return row[col][0].split(',')[-2]
    return None

df["zip"] = df[["Coordinates_1","Coordinates_2"]].apply(pick_zip, axis = 1)

print(df["zip"])

# Write the dataframe to an Excel file; the with-block saves and closes the workbook
# (writer.save() is no longer available in recent pandas versions).
with pd.ExcelWriter('coordinates_result.xlsx', engine='xlsxwriter') as writer:
    df.to_excel(writer, sheet_name='Sheet1')
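
One caveat: `split(',')[-2]` assumes the postal code is always the second-to-last field of the returned address, which may not hold for every result. As a sketch, and assuming the data are Italian addresses with five-digit postal codes (the question reads a file named Italy.csv), one could instead look for a field that consists of exactly five digits. `find_postcode` is a hypothetical helper and the sample addresses are made up.

```python
import re

FIVE_DIGITS = re.compile(r"\d{5}")


def find_postcode(address):
    """Return the last comma-separated field that is exactly five digits, or None."""
    if address is None:
        return None
    for field in reversed(address.split(",")):
        field = field.strip()
        if FIVE_DIGITS.fullmatch(field):
            return field
    return None


print(find_postcode("Via Roma, Torino, Piemonte, 10121, Italia"))
print(find_postcode("Torino, Piemonte, Italia"))
```

For a geopy result `loc`, the address string is `loc[0]`, as in the code above, so the call would be `find_postcode(loc[0])` after checking that `loc` is not None.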