Application error collection

Here is my setup for a web scraper:

import re
import urllib.request

# x is the number of the school, e.g. 218 (one of the numbers that fails)
x = 218
url = "http://roosters5.gepro-osi.nl/roosters/rooster.php?&school={}".format(x)
htmlfile = urllib.request.urlopen(url)
htmltext = htmlfile.read().decode('utf-8')

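Normally this returns the HTML file, and I can search it with re.findall() for the content I want. However, it does not work for some school numbers, for example 218.

What causes this, and how can I fix it?

How can I decode the site in Python if it is not encoded as utf-8?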
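
For reference, here is a minimal sketch of one way the decoding step could be made more tolerant, assuming the failure for numbers such as 218 is a UnicodeDecodeError raised by .decode('utf-8'). The helper names decode_body and fetch_text are hypothetical, and the cp1252 fallback is only a guess, since the actual encoding of the page is not known here. The idea is to try the charset declared in the Content-Type header first, then a short list of fallbacks, and finally latin-1, which can decode any byte sequence.

```python
import urllib.request


def decode_body(raw, declared=None, fallbacks=("utf-8", "cp1252")):
    """Decode raw bytes, trying the declared charset first, then the fallbacks.

    `declared` is whatever charset the server announced (may be None).
    The fallback list is an assumption, not something known about the site.
    """
    candidates = ([declared] if declared else []) + list(fallbacks)
    for encoding in candidates:
        try:
            return raw.decode(encoding)
        except (UnicodeDecodeError, LookupError):
            # Wrong or unknown encoding: move on to the next candidate.
            continue
    # latin-1 maps every byte to a character, so this call cannot fail,
    # although the resulting text may contain wrong characters.
    return raw.decode("latin-1")


def fetch_text(url):
    """Download a page and decode it with decode_body."""
    with urllib.request.urlopen(url) as response:
        # Charset from the Content-Type header, or None if it is absent.
        declared = response.headers.get_content_charset()
        return decode_body(response.read(), declared)
```

With this, htmltext = fetch_text(url) would replace the read().decode('utf-8') line. If the text still looks wrong, printing the bytes around the position reported in the UnicodeDecodeError may help identify the real encoding.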
