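I am trying to scrape geolocations based on URLs. After about 500 lookups and extracted locations, I get an encoding error. I specified utf-8 encoding in my code and also ran the following commands in cmd: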
chcp 65001
set PYTHONIOENCODING=utf-8
However, I still get the following error:
Traceback (most recent call last):
  File "__main__.py", line 33, in <module>
    outputfile.write(newline)
  File "C:\Program Files\Anaconda3\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\ufe0f' in position 62: character maps to <undefined>
I am using an up-to-date Python 3.x on Anaconda, with all packages updated.
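A note on the traceback: it passes through cp1252.py, which suggests the output file is being encoded with the Windows default code page rather than UTF-8. As far as I understand, `chcp` and `PYTHONIOENCODING` only affect the console and stdin/stdout/stderr, not files opened with `open()`. The snippet below is a minimal sketch that reproduces the problem in isolation; the sample string and the helper name `can_encode` are made up for illustration.

```python
# U+FE0F (VARIATION SELECTOR-16) frequently follows emoji in tweet text.
sample = "caf\u00e9 \u2764\ufe0f"

def can_encode(text, encoding):
    """Return True if text can be encoded with the given codec."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

# cp1252 has no mapping for U+FE0F, while UTF-8 can encode any character.
print("cp1252:", can_encode(sample, "cp1252"))
print("utf-8:", can_encode(sample, "utf-8"))
```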
#!/usr/bin/python
import sys
from twitter_location import get_location
from google_location import get_coordinates

# Open output file
outputfile = open(sys.argv[2], 'w')

# Read input file
with open(sys.argv[1], 'r', encoding = "utf-8", errors='ignore') as csv:
    # Skip headers line
    next(csv)
    # Loop in lines
    for line in csv:
        # Extract userid
        print (line)
        permalink = line.split(',')[-1].strip()
        userid = permalink.split('/')[3]
        # Get location as string if exists
        location = get_location(userid)
        if location is None:
            print ('user {} can not be reached or do not exposes any location.'.format(userid))
            continue
        else:
            # If location is ok, get coordinates
            coordinates = get_coordinates(location)
            print ('{}: {}'.format(userid, coordinates))
            # Copy current input line and add coordinates at the end
            newline = '{},{}\n'.format(line.strip(), coordinates)
            # Write in output file
            outputfile.write(newline)
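In the script above only the input file is opened with an explicit encoding; the output file falls back to the platform default. One possible direction, shown here as a sketch rather than a confirmed fix, is to pass `encoding="utf-8"` when opening the output file too. The function name `copy_with_extra_column`, the temporary file names, and the sample data are assumptions for illustration, and the real `get_location` / `get_coordinates` calls are replaced by a fixed value.

```python
import os
import tempfile

def copy_with_extra_column(in_path, out_path, extra_value):
    """Copy data lines (header skipped) and append one extra value per line."""
    # Explicit encoding on BOTH files, so the platform default is never used.
    with open(in_path, "r", encoding="utf-8") as src, \
         open(out_path, "w", encoding="utf-8") as dst:
        next(src)  # skip the header line, as the original script does
        for line in src:
            dst.write("{},{}\n".format(line.strip(), extra_value))

# Self-check using a line that contains U+FE0F, the character from the traceback.
tmpdir = tempfile.mkdtemp()
in_path = os.path.join(tmpdir, "in.csv")
out_path = os.path.join(tmpdir, "out.csv")
with open(in_path, "w", encoding="utf-8") as f:
    f.write("username,permalink\n")
    f.write("some\u2764\ufe0fuser,https://twitter.com/someuser/status/1\n")

copy_with_extra_column(in_path, out_path, "(0.0, 0.0)")
with open(out_path, "r", encoding="utf-8") as f:
    written = f.read()

# ascii() keeps the printout safe even on a console that cannot show the emoji.
print(ascii(written))
```

Wrapping the output file in a `with` block also ensures it is flushed and closed, which the original `open(sys.argv[2], 'w')` call does not guarantee.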
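I am looking for two things here:
Help resolving the encoding error.
I would like the output to contain the input header plus a header for the new column.
My input file has the following header: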
username date retweets favorites text geo mentions hashtags id permalink
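When writing the output, I do get all the columns plus the new geo-coordinates column, but I can't get the header back into the output file.
Any help is appreciated. Thanks in advance.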
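For the header, one option might be to keep the line returned by `next()` instead of discarding it, and write it to the output with the new column name appended before the loop starts. This is only a sketch: it assumes the file is comma-delimited (the script splits on ','), and the names `copy_with_header`, `coordinates`, and the sample rows are hypothetical.

```python
import os
import tempfile

def copy_with_header(in_path, out_path, new_column, extra_value):
    """Copy all lines, extending the header with new_column and each row with extra_value."""
    with open(in_path, "r", encoding="utf-8") as src, \
         open(out_path, "w", encoding="utf-8") as dst:
        # Keep the header instead of skipping it, and add the new column name.
        header = next(src).strip()
        dst.write("{},{}\n".format(header, new_column))
        for line in src:
            dst.write("{},{}\n".format(line.strip(), extra_value))

# Small demonstration with made-up data.
demo_dir = tempfile.mkdtemp()
src_path = os.path.join(demo_dir, "tweets.csv")
dst_path = os.path.join(demo_dir, "tweets_geo.csv")
with open(src_path, "w", encoding="utf-8") as f:
    f.write("username,date,permalink\n")
    f.write("someuser,2017-01-01,https://twitter.com/someuser/status/1\n")

copy_with_header(src_path, dst_path, "coordinates", "unknown")
with open(dst_path, "r", encoding="utf-8") as f:
    out_lines = f.read().splitlines()
print(out_lines[0])
```

If the tweet text itself can contain commas, the `csv` module's reader and writer would likely be more robust than splitting and joining on ',' by hand.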