I am using pygeocoder to geocode a list of addresses. Here is my code:
import csv
import pandas as pd
from pygeocoder import Geocoder
from pygeocoder import GeocoderError

df = pd.read_csv('C:\Users\L\Desktop\germanfdiaddress.csv', encoding="iso-8859-1")
address = df.Address
print address

add = []
lat = []
lng = []
pcode = []

for a in address:
    try:
        result = Geocoder.geocode(a)
        lat.extend([result[0].coordinates[0]])
        lng.extend([result[0].coordinates[1]])
        pcode.extend([result[0].postal_code])
    except GeocoderError:
        continue
    result = Geocoder.geocode(a)
    lat.extend([result[0].coordinates[0]])
    lng.extend([result[0].coordinates[1]])
    pcode.extend([result[0].postal_code])

fields = 'add', 'lat', 'lng', 'pcode'
rows = zip(address, lat, lng, pcode)

with open('C:\Users\L\Desktop\myfile.csv', 'wb') as outfile:
    w = csv.writer(outfile)
    w.writerow(fields)
    for i in rows:
        w.writerow(i)
But I get the following error:
Traceback (most recent call last):
  File "C:\Users\Jesus\Dropbox\coding\python\geocoder with uft-8, with complete output.py", line 42, in <module>
    w.writerow(i)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xfc' in position 13: ordinal not in range(128)
Any ideas about what is going on? I know my code works, except for writing to the csv file.
Here is the csv file: https://www.dropbox.com/s/6yprg2u1ghuygye/germanfdiaddress.csv
Answer 0 (score: 1)
So I just swapped the csv module for unicodecsv, and it works perfectly. Here is the new code:
import unicodecsv
import pandas as pd
from pygeocoder import Geocoder
from pygeocoder import GeocoderError

df = pd.read_csv('C:\Users\L\Desktop\germanfdiaddress.csv', encoding="iso-8859-1")
address = df.Address
print address

add = []
lat = []
lng = []
pcode = []

for a in address:
    try:
        result = Geocoder.geocode(a)
        lat.extend([result[0].coordinates[0]])
        lng.extend([result[0].coordinates[1]])
        pcode.extend([result[0].postal_code])
    except GeocoderError:
        continue

fields = 'add', 'lat', 'lng', 'pcode'
rows = zip(address, lat, lng, pcode)

with open('C:\Users\L\Desktop\myfile.csv', 'wb') as outfile:
    w = unicodecsv.writer(outfile, encoding='iso-8859-1')
    w.writerow(fields)
    for i in rows:
        w.writerow(i)
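One thing worth checking in the loop above: when an address fails to geocode, nothing is appended to lat, lng, or pcode, yet zip(address, ...) still pairs the lists by position, so every later row can end up attached to the wrong address. A minimal sketch of one way around that is to build each row in a single step. It is written for Python 3, and fake_geocode and GeocodeFailed are hypothetical stand-ins for Geocoder.geocode and GeocoderError so that it runs without network access:

```python
class GeocodeFailed(Exception):
    """Hypothetical stand-in for pygeocoder's GeocoderError."""


def fake_geocode(address):
    """Hypothetical lookup: returns (lat, lng, postal_code) or raises."""
    known = {
        "Berlin": (52.52, 13.405, "10117"),
        "Köln": (50.9375, 6.9603, "50667"),
    }
    try:
        return known[address]
    except KeyError:
        raise GeocodeFailed(address)


def geocode_all(addresses, geocode=fake_geocode):
    """Return one complete row per successfully geocoded address."""
    rows = []
    for a in addresses:
        try:
            lat, lng, pcode = geocode(a)
        except GeocodeFailed:
            continue  # skip the whole row, so nothing gets out of step
        rows.append((a, lat, lng, pcode))
    return rows


# "Nowhere" is unknown to the fake lookup and is dropped as a whole row.
rows = geocode_all(["Berlin", "Nowhere", "Köln"])
```

Because the address travels with its own coordinates, a failed lookup removes one row instead of shifting all the rows after it.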
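Answer 1 (score: 0)
The csv module (in Python 2) has trouble with input that is not ASCII or UTF-8 bytes. From the documentation:
This version of the csv module doesn't support Unicode input. Also, there are currently some issues regarding ASCII NUL characters. Accordingly, all input should be UTF-8 or printable ASCII to be safe;
For simple reading and writing, you can use the example UnicodeWriter class from the documentation.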
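For comparison, the limitation quoted above is specific to Python 2. A minimal sketch, assuming Python 3 is an option: there the built-in csv module accepts text directly, as long as the file is opened in text mode with an explicit encoding. The sample row, the write_rows helper, and the temporary output path are made up for illustration:

```python
import csv
import os
import tempfile

# Hypothetical sample data shaped like the rows in the question.
fields = ("add", "lat", "lng", "pcode")
rows = [("Münchner Straße 1, Düsseldorf", 51.2277, 6.7735, "40210")]


def write_rows(path, fields, rows, encoding="utf-8"):
    """Write a header line and the data rows to path as CSV."""
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", newline="", encoding=encoding) as outfile:
        writer = csv.writer(outfile)
        writer.writerow(fields)
        writer.writerows(rows)  # quoting and number-to-text conversion are handled


out_path = os.path.join(tempfile.mkdtemp(), "myfile.csv")
write_rows(out_path, fields, rows)
```

The address contains a comma, so the writer quotes that field automatically; a hand-rolled ','.join would not.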
Alternatively, you can simplify the code:
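import codecs
# ...
# codecs.open encodes unicode values to UTF-8 as they are written.
with codecs.open(r'C:\Users\L\Desktop\myfile.csv',
                 mode='w', encoding='utf-8') as outfile:
    # Unicode format strings avoid an implicit ASCII encode in Python 2.
    outfile.write(u'{}\n'.format(u','.join(fields)))
    for i in rows:
        # lat and lng are floats, so convert every value to text before joining.
        # No CSV quoting is done here: a value containing a comma will split.
        outfile.write(u'{}\n'.format(u','.join(unicode(v) for v in i)))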
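When you use \ as the path separator, write the path as a raw string, as in r'C:\Users\L\Desktop\myfile.csv'. This prevents something like 'C:\newfile' from being misinterpreted: in an ordinary string literal, \n is read as a newline character.
You can also use forward slashes (even on Windows), which removes the need for raw strings.
Or you can use os.path.join to build the file path.
The point is to avoid bare \ in ordinary string literals.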
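A short sketch of the difference, written for Python 3. It uses ntpath, which is the implementation behind os.path on Windows, so that the result is the same on any operating system; on Windows itself, os.path.join behaves the same way. Note that in Python 3 an ordinary literal such as 'C:\Users' is a syntax error, because \U starts a Unicode escape, which is one more reason to prefer the forms below:

```python
import ntpath  # Windows flavour of os.path, importable on any platform

# In an ordinary literal, \n is an escape sequence: this string holds a newline.
plain = 'C:\newfile'
# A raw literal keeps the backslash and the letter n as two separate characters.
raw = r'C:\newfile'

# Three safe ways to spell the same Windows path.
raw_path = r'C:\Users\L\Desktop\myfile.csv'
slash_path = 'C:/Users/L/Desktop/myfile.csv'
joined_path = ntpath.join('C:\\', 'Users', 'L', 'Desktop', 'myfile.csv')

print(repr(plain))
print(repr(raw))
print(joined_path == raw_path)
print(ntpath.normpath(slash_path) == raw_path)
```

The first two lines of output show that only the raw literal still contains a backslash; the last two confirm that all three spellings name the same file.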
Answer 2 (score: 0)
For a cleaner, more Pythonic look, you can use Geocoder (available on GitHub and PyPI) instead of pygeocoder. For the Unicode problem, unicodecsv is great, and it lets you keep using DictWriter and DictReader. Here is a code example:
import geocoder
import unicodecsv
import logging

# CSV Writer
csvfile = open('address_out.csv', 'wb')
fieldnames = ['source', 'address', 'lat', 'lng', 'postal']
writer = unicodecsv.DictWriter(csvfile, fieldnames=fieldnames, encoding='utf-8')
writer.writeheader()

# CSV Reader
with open('address.csv', 'rb') as f:
    reader = unicodecsv.DictReader(f, encoding='iso-8859-1')
    for line in reader:
        address = line['Address']

        # Geocoding
        g = geocoder.google(address)
        if g.ok:
            row = {}
            row['source'] = address
            row['address'] = g.address
            row['lat'] = g.lat
            row['lng'] = g.lng
            row['postal'] = g.postal
            writer.writerow(row)
            logging.info('Geocoding SUCCESS: ' + address)
        else:
            logging.warning('Geocoding ERROR: ' + address)
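The same read-ISO-8859-1, write-UTF-8 pattern can also be sketched with only the standard library, assuming Python 3, where csv.DictReader and csv.DictWriter work on text files opened with an explicit encoding. The copy_with_lookup helper, the coords table, and the temporary files are hypothetical; the table stands in for a real geocoding call so the example runs offline:

```python
import csv
import os
import tempfile


def copy_with_lookup(src_path, dst_path, lookup):
    """Read ISO-8859-1 rows, enrich them via lookup, write UTF-8 rows."""
    fieldnames = ["source", "lat", "lng"]
    with open(src_path, newline="", encoding="iso-8859-1") as src, \
            open(dst_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.DictWriter(dst, fieldnames=fieldnames)
        writer.writeheader()
        for line in csv.DictReader(src):
            result = lookup(line["Address"])
            if result is None:
                continue  # lookup failed; skip this address
            lat, lng = result
            writer.writerow({"source": line["Address"], "lat": lat, "lng": lng})


# Build a small ISO-8859-1 input file so the sketch is self-contained.
workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, "address.csv")
dst_path = os.path.join(workdir, "address_out.csv")
with open(src_path, "w", newline="", encoding="iso-8859-1") as f:
    f.write("Address\r\nDüsseldorf\r\nNowhere\r\n")

# Hypothetical lookup table standing in for a real geocoding call.
coords = {"Düsseldorf": (51.2277, 6.7735)}
copy_with_lookup(src_path, dst_path, coords.get)
```

Both files are opened with with, so they are closed even if a lookup raises partway through.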