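How can I get only the ZIP code rather than the whole address? Right now it prints a full address that includes the ZIP code. Is it possible to extract just the ZIP code?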
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from tabulate import tabulate
from geopy.geocoders import Nominatim

my_data = pd.read_csv('dt/TrafficCounts_OpenData_wm.csv')
# Nominatim needs a user_agent that identifies the application
geolocator = Nominatim(user_agent="my_application")
sub_set = my_data[["POINT_Y", "POINT_X"]]
count = 0
for y in sub_set.itertuples():
    # y[0] is the row index; y[1] is POINT_Y (latitude), y[2] is POINT_X (longitude)
    mypoint = str(y[1]) + ' ,' + str(y[2])
    print(mypoint)
    location = geolocator.reverse(mypoint)
    print(location)
    if count == 5: break
    count += 1
Answer 0 (score: 0)
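In these US addresses the ZIP code is the last 5-digit (or 5+4-digit) number, so you can use the following regular expression to extract it from the address held in the location variable. Note that reverse() returns a Location object, not a string, so pass location.address to re.search:
import re
# location.address is the address string of the geopy Location object
match = re.search(r'\d{5}(?:-\d{4})?(?=\D*$)', location.address)
zipcode = match.group() if match else None  # None when no ZIP code is present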
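To try the pattern without calling the geocoder, here is a minimal sketch that wraps it in a helper and runs it on a plain address string. The name extract_zip is hypothetical, and the added \b word boundaries are an assumption of this sketch (not part of the answer's pattern) so that the tail of a longer digit run is not mistaken for a ZIP code.

```python
import re

# Last 5-digit or ZIP+4 number in the string: the lookahead requires that
# only non-digit characters follow it. The \b anchors are an addition made
# for this sketch.
ZIP_RE = re.compile(r'\b\d{5}(?:-\d{4})?\b(?=\D*$)')

def extract_zip(address):
    """Return the trailing ZIP code in address, or None if there is none."""
    match = ZIP_RE.search(address)
    return match.group() if match else None

sample = "2345, Commonwealth Street, Houston, Harris County, Texas, 77006, USA"
print(extract_zip(sample))
```

Inside the loop from the question this would be called as extract_zip(location.address), after checking that reverse() did not return None.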
Answer 1 (score: 0)
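If you don't know regular expressions, I suppose you could do something like the following. You should learn them, though, because they give you more reliable behavior.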
data ='''29.607416999999998 ,-95.114007 Pinebrook KinderCare, 4422,Clear Lake City Boulevard, Houston, Harris County, Texas,77059,USA
29.74770501 ,-95.39656199 2345, Commonwealth Street, Houston, Harris County, Texas, 77006, USA
29.707028 ,-95.59624701 Hastings Ninth Grade Center, 6750, Cook Road, Houston, Harris County, Texas, 77072, USA
29.59038673 ,-95.47975719 6333, Court Road, Houston, Fort Bend County, Texas, 77053, USA
29.67591366 ,-95.32867835 7084, Crestmont Street, Houston, Harris County, Texas, 77033, USA'''
dl = data.split('USA')
# print(dl)
# 1) explicit loop
zip_code_lst = []
for addrs in dl:
    # strip the trailing ", " so the last 5 characters are the ZIP code
    zip_found = addrs.rstrip(', ')[-5:]
    if len(zip_found) == 5:
        zip_code_lst.append(zip_found)
print(zip_code_lst)  # ['77059', '77006', '77072', '77053', '77033']
# 2) list comprehension (keeps the empty piece left after the last 'USA')
zip_code_lst_comp = [addrs.rstrip(', ')[-5:] for addrs in dl]
print(zip_code_lst_comp)  # ['77059', '77006', '77072', '77053', '77033', '']
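The slicing above relies on every address ending in "USA" with the ZIP code directly before it. A slightly more defensive variant, still without regular expressions, is sketched below for a single address string. The function name zip_from_address is hypothetical, and it assumes the ZIP code appears as its own comma-separated field of exactly five digits.

```python
def zip_from_address(address):
    """Return the right-most comma-separated field made of exactly 5 digits."""
    # Walk the fields from the end, since the ZIP code sits near the tail
    for part in reversed(address.split(',')):
        token = part.strip()
        if len(token) == 5 and token.isdigit():
            return token
    return None  # no 5-digit field found

addr = "6333, Court Road, Houston, Fort Bend County, Texas, 77053, USA"
print(zip_from_address(addr))
```

Unlike the fixed [-5:] slice, this returns None instead of an arbitrary substring when an address has no ZIP field, but it does not handle the ZIP+4 form.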