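My current professor uses Python 2.7 for the examples in class, but professors I will have later have suggested that I use Python 3.5. I am trying to convert my current professor's examples from 2.7 to 3.5. Right now I am having trouble with the urllib2 package, which I understand has been split up in Python 3.
The original code in the iPython notebook looks like this: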
import csv
import urllib2
data_url = 'http://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data'
response = urllib2.urlopen(data_url)
myreader = csv.reader(response)
for i in range(5):
    row = next(myreader)
    print ','.join(row)
I converted it to:
import csv
import urllib.request
data_url = 'http://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data'
response = urllib.request.urlopen(data_url)
myreader = csv.reader(response)
for i in range(5):
    row = next(myreader)
    print(','.join(row))
But this gives me an error:
Error                                     Traceback (most recent call last)
<ipython-input-19-20da479e256f> in <module>()
      7 myreader = csv.reader(response)
      8 for i in range(5):
----> 9     row = next(myreader)
     10     print(','.join(row))
Error: iterator should return strings, not bytes (did you open the file in text mode?)
I am not sure how to proceed from here. Any ideas?
Answer 0 (score: 1)
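In Python 3, urllib.request.urlopen returns a response that yields bytes, while csv.reader expects an iterator of strings. Wrap response in another iterator that decodes the bytes to strings and yields them: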
import csv
import urllib.request
def decode_iter(it):
    # iterate line by line
    for line in it:
        # convert bytes to string using `bytes.decode` (UTF-8 by default)
        yield line.decode()
data_url = 'http://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data'
response = urllib.request.urlopen(data_url)
myreader = csv.reader(decode_iter(response))
for i in range(5):
    row = next(myreader)
    print(','.join(row))
UPDATE
You can use codecs.iterdecode instead of decode_iter:
import csv
import codecs
import urllib.request
data_url = 'http://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data'
response = urllib.request.urlopen(data_url)
myreader = csv.reader(codecs.iterdecode(response, 'utf-8'))
for i in range(5):
    row = next(myreader)
    print(','.join(row))
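If you want to try the decoding step without hitting the network, here is a minimal sketch that uses io.BytesIO as a stand-in for the HTTP response. The SAMPLE bytes and the read_rows helper are made up for illustration and are not part of the original code; the sketch also assumes the data is UTF-8 encoded.

```python
import codecs
import csv
import io

# Hypothetical sample bytes standing in for the HTTP response body.
SAMPLE = b"39, State-gov, 77516\n50, Self-emp-not-inc, 83311\n"

def read_rows(byte_stream, encoding="utf-8"):
    """Lazily decode an iterator of bytes lines and parse it as CSV."""
    # codecs.iterdecode turns an iterator of bytes into an iterator of str,
    # which is what csv.reader expects in Python 3.
    text_lines = codecs.iterdecode(byte_stream, encoding)
    return list(csv.reader(text_lines))

# Iterating a BytesIO yields bytes lines, much like the urlopen response does.
rows = read_rows(io.BytesIO(SAMPLE))
for row in rows:
    print(','.join(row))
```

Passing the BytesIO object straight to csv.reader should raise the same "iterator should return strings, not bytes" error as in the question, which makes it a convenient way to confirm that the decoding wrapper is what fixes it.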