How do I update this code that uses urllib2 to Python 3?

Asked: 2017-01-18 14:58:15

Tags: python python-2.7 python-3.x ipython-notebook

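My current professor uses Python 2.7 for the examples in class, but other professors whose classes I will take in the future have suggested that I use Python 3.5. I am trying to convert my current professor's examples from 2.7 to 3.5. Right now I am having a problem with the urllib2 package, which I understand has been split up in Python 3.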

The original code in the iPython notebook looks like this:

import csv
import urllib2

data_url = 'http://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data'
response = urllib2.urlopen(data_url)

myreader = csv.reader(response)
for i in range(5):
    row = next(myreader)
    print ','.join(row)

I converted it to:

import csv
import urllib.request

data_url = 'http://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data'
response = urllib.request.urlopen(data_url)
myreader = csv.reader(response)
for i in range(5):
    row = next(myreader)
    print(','.join(row))

But this gives me an error:

Error                                     Traceback (most recent call last)
<ipython-input-19-20da479e256f> in <module>()
      7 myreader = csv.reader(response)
      8 for i in range(5):
----> 9     row = next(myreader)
     10     print(','.join(row))

Error: iterator should return strings, not bytes (did you open the file in text mode?)

I am not sure how to proceed from here. Any ideas?

1 Answer:

Answer 0: (score: 1)

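In Python 3, urllib.request.urlopen returns a response that yields bytes, while csv.reader expects strings. Wrap response in another iterator that decodes each bytes line and yields strings: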

import csv
import urllib.request

def decode_iter(it):
    # iterate line by line
    for line in it:
        # convert bytes to string using `bytes.decode` (UTF-8 by default)
        yield line.decode()

data_url = 'http://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data'
response = urllib.request.urlopen(data_url)
myreader = csv.reader(decode_iter(response))
for i in range(5):
    row = next(myreader)
    print(','.join(row))
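
The same idea can be checked without a network connection. The following is only a minimal sketch: `io.BytesIO` stands in for the HTTP response, the two sample rows are made up for illustration (they are not taken from adult.data), and the `encoding` parameter on `decode_iter` is an addition that is not in the code above.

```python
import csv
import io

def decode_iter(it, encoding="utf-8"):
    # Decode each bytes line into a str before csv.reader sees it.
    for line in it:
        yield line.decode(encoding)

# Hypothetical sample rows standing in for the HTTP response body.
sample = b"39,State-gov,77516\n50,Self-emp-not-inc,83311\n"
fake_response = io.BytesIO(sample)

# Feeding bytes straight to csv.reader reproduces the error from the question.
try:
    next(csv.reader(fake_response))
    raised = False
except csv.Error as exc:
    raised = True
    print("csv.reader on bytes failed:", exc)

# Rewind and decode line by line; now the rows parse.
fake_response.seek(0)
rows = list(csv.reader(decode_iter(fake_response)))
print(rows)
```

If the server sends something other than UTF-8, the encoding passed to `decode_iter` would need to match it.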

Update

You can use codecs.iterdecode instead of decode_iter:

import csv
import codecs
import urllib.request

data_url = 'http://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data'
response = urllib.request.urlopen(data_url)
myreader = csv.reader(codecs.iterdecode(response, 'utf-8'))
for i in range(5):
    row = next(myreader)
    print(','.join(row))
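
Another option, not part of the answer above and offered only as a sketch, is to wrap the binary stream in `io.TextIOWrapper` from the standard library, which decodes it as a text file. Here `io.BytesIO` and the two sample rows are hypothetical stand-ins for the real response; with a live request the object returned by `urllib.request.urlopen` would take the place of `fake_response`.

```python
import csv
import io

# Hypothetical bytes standing in for the body returned by urlopen().
sample = "Zoë,1\nBob,2\n".encode("utf-8")
fake_response = io.BytesIO(sample)

# TextIOWrapper decodes the binary stream into text;
# newline="" leaves line-ending handling to the csv module.
text_stream = io.TextIOWrapper(fake_response, encoding="utf-8", newline="")

rows = list(csv.reader(text_stream))
print(rows)
```

This keeps the decoding in one place and behaves like a file opened in text mode, which is what the error message in the question hints at.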