I wrote the following Python script to retrieve information from the Yahoo Finance website and store it in a file. Here is the script:
import urllib.request
from bs4 import BeautifulSoup
in_data = open('list_of_companies.txt','r',encoding='utf-8', errors='ignore')
for line in in_data:
    page = urllib.request.urlopen('http://finance.yahoo.com/rss/headline?s='+line.strip())
    page = page.read()
    page = page.decode('utf-8','ignore')
    soup = BeautifulSoup(page)
    ans = line[0:len(line.strip())]
    ans += '.txt'
    f1 = open(ans,'w')
    f1.write(soup.prettify())
    f1.close()
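It reads the input symbols from list_of_companies.txt and stores the result for each symbol in a file with that name. However, when I run the script, I get the following error: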
Traceback (most recent call last):
  File "C:\Users\Darshil Babel\Desktop\NewsContent\s.py", line 12, in <module>
    f1.write(soup.prettify())
  File "C:\Python34\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u03cb' in position 1556: character maps to <undefined>
I am using the urllib and BeautifulSoup modules. Can anyone help me?
Answer 0 (score: 1)
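You opened the output file without specifying a Unicode-compatible encoding, so Python fell back to the platform default, which the traceback shows is cp1252. That code page has no mapping for '\u03cb', hence the UnicodeEncodeError. Try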
f1 = open(ans,'w', encoding='utf-8')
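
To see the difference in isolation, here is a minimal sketch that needs no network access. It writes the character from the traceback to a temporary file twice, once with cp1252 and once with UTF-8. The file name demo.txt and the variable names are made up for illustration, and cp1252 is passed explicitly so the behavior should be the same on any platform.

```python
import os
import tempfile

# The character reported in the traceback (U+03CB).
text = "symbol \u03cb"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.txt")  # hypothetical file name

    # cp1252 has no mapping for U+03CB, so this write fails
    # the same way as in the question.
    try:
        with open(path, "w", encoding="cp1252") as f:
            f.write(text)
        cp1252_ok = True
    except UnicodeEncodeError:
        cp1252_ok = False

    # UTF-8 can encode every Unicode character, so this write succeeds.
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)

    # Read the file back with the same encoding to confirm a clean round trip.
    with open(path, encoding="utf-8") as f:
        round_trip = f.read()

print(cp1252_ok, round_trip == text)
```

Whatever later reads the generated .txt files should also open them with encoding='utf-8', otherwise the same mismatch can reappear on the reading side.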