I have a Pandas dataframe, and I am calling a geopy function on every row of the 'address_original' column to get as much detail as possible about each address:
from geopy.geocoders import GoogleV3
geolocator = GoogleV3(api_key='', timeout=5,)
df['full_address_geocoded'] = df['address_original'].progress_apply(geolocator.geocode)
print df['full_address_geocoded'][:5]
0 (Veale Rd, New Plymouth, New Zealand, (-39.095...
1 (Veale Rd, Delaware, USA, (39.8036422, -75.494...
2 (1068 Clearwater Valley Rd, Clearwater, BC V0E...
3 (1605 Pine St W, Stillwater, MN 55082, USA, (4...
4 None
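The problem is that I need more information than the geocode result shows by default. Geopy results also have a .raw attribute, which works perfectly on a single string and outputs data about the neighborhood, region, and so on that the address belongs to: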
places = '1238 Davie St, Vancouver, BC'
geolocator = GoogleV3(api_key='AIzaSyBlNIvZTk-BpWDeX1FFXPbx6QwbNzZL80w')
location = geolocator.geocode(places, language='en')
print location.raw
{u'geometry': {u'location_type': u'ROOFTOP', u'bounds': {u'northeast': {u'lat': 50.6539919, u'lng': -120.3383232}, u'southwest': {u'lat': 50.6538198, u'lng': -120.3386968}}, u'viewport': {u'northeast': {u'lat': 50.6552548302915, u'lng': -120.3371610197085}, u'southwest': {u'lat': 50.6525568697085, u'lng': -120.3398589802915}}, u'location': {u'lat': 50.6539239, u'lng': -120.3385242}}, u'address_components': [{u'long_name': u'142', u'types': [u'street_number'], u'short_name': u'142'}, {u'long_name': u'Waddington Drive', u'types': [u'route'], u'short_name': u'Waddington Dr'}, {u'long_name': u'Upper Sahali', u'types': [u'neighborhood', u'political'], u'short_name': u'Upper Sahali'}, {u'long_name': u'Kamloops', u'types': [u'locality', u'political'], u'short_name': u'Kamloops'}, {u'long_name': u'Thompson-Nicola', u'types': [u'administrative_area_level_2', u'political'], u'short_name': u'Thompson-Nicola'}, {u'long_name': u'British Columbia', u'types': [u'administrative_area_level_1', u'political'], u'short_name': u'BC'}, {u'long_name': u'Canada', u'types': [u'country', u'political'], u'short_name': u'CA'}, {u'long_name': u'V2E 1N3', u'types': [u'postal_code'], u'short_name': u'V2E 1N3'}], u'place_id': u'ChIJaV6wtDgsflMR1J9ReM0lzLs', u'formatted_address': u'142 Waddington Dr, Kamloops, BC V2E 1N3, Canada', u'types': [u'premise']}
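For reference, individual fields can be pulled out of a raw result like the one above by looking up the entry in 'address_components' with the matching type. This is only a minimal sketch that assumes the response layout printed above; the helper name component_by_type and the trimmed sample_raw dict are hypothetical and used purely for illustration.

```python
def component_by_type(raw, wanted_type, key="long_name"):
    """Return the first address component tagged with wanted_type, or None."""
    for component in raw.get("address_components", []):
        if wanted_type in component.get("types", []):
            return component.get(key)
    return None


# Trimmed copy of the raw result printed above (illustration only).
sample_raw = {
    "address_components": [
        {"long_name": "Kamloops", "short_name": "Kamloops",
         "types": ["locality", "political"]},
        {"long_name": "British Columbia", "short_name": "BC",
         "types": ["administrative_area_level_1", "political"]},
        {"long_name": "V2E 1N3", "short_name": "V2E 1N3",
         "types": ["postal_code"]},
    ],
}

city = component_by_type(sample_raw, "locality")
province = component_by_type(sample_raw, "administrative_area_level_1",
                             key="short_name")
print(city)
print(province)
```

A type that is missing from the result simply yields None, so the helper can be used on rows where Google returned only a partial match.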
Here is a sample of the dataframe:
print df.info()
print type(df)
print df['address_original'][:5]
print type(df['address_original'])
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2679 entries, 0 to 2678
Columns: 111 entries, access to zoning
dtypes: object(111)
memory usage: 2.3+ MB
None
<class 'pandas.core.frame.DataFrame'>
0 184,VEALE ROAD, Canada
1 124,VEALE ROAD, Canada
2 1068,CLEARWATER VALLEY ROAD, Canada
3 1605,PINE STREET, Canada
4 1425,LOPEZ CREEK DRIVE, Canada
Name: address_original, dtype: object
<class 'pandas.core.series.Series'>
However, the .raw attribute does not work on multiple values: I get TypeError: 'dict' object is not callable.
How can I solve this?
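One possible cause, and this is an assumption since the failing line is not shown, is that a dict such as location.raw was handed to apply (or called like a function), whereas apply expects a callable that takes one address. A minimal sketch of a per-row wrapper follows; geocode_raw, FakeLocation, and fake_geocode are hypothetical names, and the fake geocoder only stands in for geolocator.geocode so the example runs without a network call or an API key.

```python
def geocode_raw(geocode, address):
    """Geocode one address; return its raw dict, or None if nothing matched."""
    location = geocode(address)
    # geocode returns None for addresses it cannot resolve, so guard
    # before touching .raw.
    return location.raw if location is not None else None


class FakeLocation(object):
    """Hypothetical stand-in for a geopy Location (only .raw is modelled)."""

    def __init__(self, raw):
        self.raw = raw


def fake_geocode(address):
    """Hypothetical stand-in for geolocator.geocode; no network access."""
    known = {"1238 Davie St, Vancouver, BC": {"formatted_address": "stub"}}
    raw = known.get(address)
    return FakeLocation(raw) if raw is not None else None


addresses = ["1238 Davie St, Vancouver, BC", "no such place"]
results = [geocode_raw(fake_geocode, a) for a in addresses]
print(results)
```

With the real geocoder, the same wrapper could presumably be applied to the column as df['address_original'].apply(lambda a: geocode_raw(geolocator.geocode, a)), which stores one raw dict (or None) per row.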