Geocoding a list of addresses from a CSV file

Time: 2013-11-12 18:13:27

Tags: python csv geocoding

I am trying to geocode a list of addresses from a CSV file. The column that holds the addresses is named "full", and it looks like this:

full
100 Ross St,15219
1014 Blackadore Ave.,15208
1026 Winterton St.,15206
...

This is the code I am using:

import csv
import pygeocoder
import pandas as pd

from pygeocoder import Geocoder

df = pd.read_csv('C:\Users\Jesus\Desktop\Events.csv')
address = df.full

result = Geocoder.geocode(address)
print(result[0].coordinates)

This is the output:

Traceback (most recent call last):
  File "C:\Users\Jesus\Desktop\python\geocode.py", line 10, in <module>
    result = Geocoder.geocode(address)
  File "C:\Python27\lib\site-packages\pygeocoder.py", line 160, in geocode
    return GeocoderResult(Geocoder.get_data(params=params))
  File "C:\Python27\lib\site-packages\pygeocoder.py", line 107, in get_data
    response_json = response.json()
  File "C:\Python27\lib\site-packages\requests\models.py", line 693, in json
    return json.loads(self.text, **kwargs)
  File "C:\Python27\lib\site-packages\simplejson\__init__.py", line 488, in loads
    return _default_decoder.decode(s)
  File "C:\Python27\lib\site-packages\simplejson\decoder.py", line 370, in decode
    obj, end = self.raw_decode(s)
  File "C:\Python27\lib\site-packages\simplejson\decoder.py", line 389, in raw_decode
    return self.scan_once(s, idx=_w(s, idx).end())
  File "C:\Python27\lib\site-packages\simplejson\scanner.py", line 122, in scan_once
    return _scan_once(string, idx)
  File "C:\Python27\lib\site-packages\simplejson\scanner.py", line 118, in _scan_once
    raise JSONDecodeError(errmsg, string, idx)
JSONDecodeError: Expecting value: line 1 column 1 (char 0)

1 answer:

Answer 0: (score: 1)

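Your address variable is a pandas Series object, so Geocoder.geocode is being handed the whole column rather than a single address string, which is probably what causes this problem. To geocode every address in the CSV, iterate over the Series like this: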

for a in address:
    result = Geocoder.geocode(a)
    print(result[0].coordinates)
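
As a rough illustration of why iterating helps, the sketch below reads a "full" column one row at a time and shows that each loop iteration receives a plain string. It uses the standard-library csv module instead of pandas so that it runs without extra packages. The sample data, the quoting around each address, and the names iter_addresses and describe are assumptions made for this example; describe is only a stand-in for Geocoder.geocode(a) and performs no geocoding.

```python
import csv
import io

# Sample data mirroring the question's "full" column. The quoting is an
# assumption so that the comma inside each address stays in one field.
sample_csv = 'full\n"100 Ross St,15219"\n"1014 Blackadore Ave.,15208"\n'


def iter_addresses(csv_file, column="full"):
    """Yield the value of one column, one row at a time, as plain strings."""
    for row in csv.DictReader(csv_file):
        yield row[column]


def describe(address):
    # Hypothetical stand-in for Geocoder.geocode(a): it only reports
    # the type and value of the argument it was called with.
    return "geocode called with {}: {!r}".format(type(address).__name__, address)


# Each call sees a single address string, never the whole column.
calls = [describe(a) for a in iter_addresses(io.StringIO(sample_csv))]
for line in calls:
    print(line)
```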

To store the results in a file (assuming Python 2.x):

with open('filename', 'w') as outfile:
    for a in address:
        result = Geocoder.geocode(a)
        print >>outfile, str(result[0].coordinates) # Prints to file

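If you prefer, you can call outfile.write(str(result[0].coordinates)) instead of printing. The only difference is that print appends a newline automatically, while write does not. To collect the results in a list, declare the list outside the for statement (e.g. coordinates_list = []) and replace the print with coordinates_list.append(result[0].coordinates). Either of these approaches works in Python 3.x, but the print >>outfile statement does not.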
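
For Python 3, one possible sketch of the same loop uses print(..., file=outfile) in place of print >>outfile and collects the results in a list at the same time. The function write_coordinates, its geocode_one parameter, and the fake_geocode stand-in are hypothetical names introduced here so that the example runs without network access. fake_geocode returns made-up numbers, not real coordinates.

```python
import io


def write_coordinates(addresses, geocode_one, outfile):
    """Geocode each address, write one result per line, and return them all.

    geocode_one is any callable that takes a single address string and
    returns its coordinates (a hypothetical parameter, not part of pygeocoder).
    """
    coordinates_list = []  # declared outside the for loop, as described above
    for a in addresses:
        coords = geocode_one(a)
        coordinates_list.append(coords)
        print(coords, file=outfile)  # Python 3 form; print adds the newline
    return coordinates_list


def fake_geocode(address):
    # Stand-in so the sketch runs offline: returns made-up values
    # derived from the string length, NOT real coordinates.
    return (float(len(address)), 0.0)


# An in-memory buffer plays the role of the output file here.
buffer = io.StringIO()
collected = write_coordinates(
    ["100 Ross St,15219", "1026 Winterton St.,15206"], fake_geocode, buffer
)
```

With pygeocoder installed, the callable passed as geocode_one might be something like lambda a: Geocoder.geocode(a)[0].coordinates, and buffer would be replaced by a file opened with open('filename', 'w').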