I have run into a problem with file encoding. It works, but the problem is that I have to parse the encoded file with BS4.
# encoding: utf-8
import codecs
from bs4 import BeautifulSoup
f1 = codecs.open("1.txt", "r", "utf-8")
text = f1.read()
soup = BeautifulSoup(text.encode('utf-8'))
for tr in soup.find_all('tr'):
    zeit = tr.find('td', class_='zeit').get_text(strip=True)
    system = tr.find('td', class_='system').get_text(strip=True)
    fehlertext = tr.find('td', class_='fehlertext').get_text(strip=True)
    print zeit, system, fehlertext
Result: UnicodeEncodeError: 'ascii' codec can't encode character u'\xfc' in position 27: ordinal not in range(128)
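One possible reading of this traceback, assuming the exception is raised by the print line rather than by the parsing: get_text() returns unicode strings, and when stdout has no usable encoding Python 2 falls back to the 'ascii' codec to write them. The sketch below reproduces that failure without BeautifulSoup and shows that encoding explicitly to UTF-8 avoids it. The sample string and the names to_ascii and to_utf8 are hypothetical and only for illustration.

```python
# -*- coding: utf-8 -*-
# Minimal sketch (assumption: the error comes from writing unicode to an
# ASCII-only stdout, not from BeautifulSoup itself).

# Hypothetical sample text containing u'\xfc' ("ü"), the character named
# in the traceback.
sample = u'f\xfcr'


def to_ascii(s):
    """Encode with the 'ascii' codec, as an ASCII-only stdout would."""
    return s.encode('ascii')


def to_utf8(s):
    """Encode explicitly to UTF-8 bytes, which can represent u'\\xfc'."""
    return s.encode('utf-8')


try:
    to_ascii(sample)
    ascii_failed = False
except UnicodeEncodeError:
    # Same exception type as in the question's output.
    ascii_failed = True

utf8_bytes = to_utf8(sample)
```

If this is indeed the cause, encoding each value before printing (for example print zeit.encode('utf-8') in the Python 2 script) or setting the PYTHONIOENCODING environment variable to utf-8 before running it would be the usual things to try.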