I'm working through exercise 41 of Learn Python the Hard Way and keep getting this error:
Traceback (most recent call last):
  File ".\url.py", line 72, in <module>
    question, answer = convert(snippet, phrase)
  File ".\url.py", line 50, in convert
    result = result.replace("###", word, 1)
TypeError: Can't convert 'bytes' object to str implicitly
I'm using Python 3 while the book uses Python 2, so I made some changes. Here is the script:
#!/usr/bin/python
# Filename: urllib.py
import random
from random import shuffle
from urllib.request import urlopen
import sys
WORD_URL = "http://learncodethehardway.org/words.txt"
WORDS = []
PHRASES = {
    "class ###(###):":
        "Make a class named ### that is-a ###.",
    "class ###(object):\n\tdef __init__(self, ***)" :
        "class ### has-a __init__ that takes self and *** parameters.",
    "class ###(object):\n\tdef ***(self, @@@)":
        "class ### has-a funciton named *** that takes self and @@@ parameters.",
    "*** = ###()":
        "Set *** to an instance of class ###.",
    "***.*** = '***'":
        "From *** get the *** attribute and set it to '***'."
}
# do they want to drill phrases first
PHRASE_FIRST = False
if len(sys.argv) == 2 and sys.argv[1] == "english":
    PHRASE_FIRST = True
# load up the words from the website
for word in urlopen(WORD_URL).readlines():
    WORDS.append(word.strip())
def convert(snippet, phrase):
    class_names = [w.capitalize() for w in
                   random.sample(WORDS, snippet.count("###"))]
    other_names = random.sample(WORDS, snippet.count("***"))
    results = []
    param_names = []
    for i in range(0, snippet.count("@@@")):
        param_count = random.randint(1,3)
        param_names.append(', '.join(random.sample(WORDS, param_count)))
    for sentence in snippet, phrase:
        result = sentence[:]
        # fake class names
        for word in class_names:
            result = result.replace("###", word, 1)
        # fake other names
        for word in other_names:
            result = result.replace("***", word, 1)
        # fake parameter lists
        for word in param_names:
            result = result.replace("@@@", word, 1)
        results.append(result)
    return results
# keep going until they hit CTRL-D
try:
    while True:
        snippets = list(PHRASES.keys())
        random.shuffle(snippets)
        for snippet in snippets:
            phrase = PHRASES[snippet]
            question, answer = convert(snippet, phrase)
            if PHRASE_FIRST:
                question, answer = answer, question
            print(question)
            input("> ")
            print("ANSWER: {}\n\n".format(answer))
except EOFError:
    print("\nBye")
What exactly am I doing wrong? Thanks!
Answer 0 (score: 30)
The lines read from urlopen() are bytes objects. To perform string operations on them, you should first convert them to str.
for word in urlopen(WORD_URL).readlines():
    WORDS.append(word.strip().decode('utf-8')) # utf-8 works in your case
To get the correct charset, see: How to download any(!) webpage with correct charset in python?
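As a minimal offline sketch of the same fix, the snippet below decodes a list of byte lines into str before they are stored. The helper name decode_words and the sample byte lines are assumptions made for illustration; in the real script the lines would come from urlopen(WORD_URL).readlines().

```python
def decode_words(raw_lines, encoding="utf-8"):
    """Strip each raw line and decode it from bytes to str (hypothetical helper)."""
    return [line.strip().decode(encoding) for line in raw_lines]


# Stand-in for what urlopen(WORD_URL).readlines() would return: bytes lines.
raw_lines = [b"account\n", b"achiever\n", b"actor\n"]

WORDS = decode_words(raw_lines)
print(WORDS)
```

Once WORDS holds only str objects, random.sample(WORDS, ...) hands str values to result.replace(), so the TypeError should no longer occur.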
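Answer 1 (score: 14)
In Python 3, the urlopen function returns an HTTPResponse object, which acts like a binary file. So, when you do this:
for word in urlopen(WORD_URL).readlines():
    WORDS.append(word.strip())
...you end up with a bunch of bytes objects instead of str objects. So when you do this:
result = result.replace("###", word, 1)
...you end up trying to replace the string "###" inside the string result with a bytes object rather than a str. Hence the error: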
TypeError: Can't convert 'bytes' object to str implicitly
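The mismatch can be reproduced without any network access. This is a small sketch with made-up values (the template string and the word b"apple" are assumptions for illustration): passing bytes to str.replace raises TypeError, while passing the decoded str works.

```python
template = "class ###(object):"
word = b"apple"  # a bytes object, like each line read from the HTTP response

error = None
try:
    # str.replace() needs str arguments, so a bytes replacement is rejected.
    template.replace("###", word, 1)
except TypeError as exc:
    error = exc

# Decoding first gives str.replace() the type it expects.
fixed = template.replace("###", word.decode("ascii"), 1)
print(type(error).__name__)
print(fixed)
```

The exact wording of the TypeError message differs between Python 3 versions, but the cause is the same.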
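The answer is to explicitly decode the words as soon as you get them. To do that, you have to figure out the right encoding from the HTTP headers. How do you do that?
In this case, I read the headers and I can tell you it's ASCII, and it's clearly a static page, so:
for word in urlopen(WORD_URL).readlines():
    WORDS.append(word.strip().decode('ascii'))
But in real life, you usually need to write code that reads the headers and works it out dynamically. Or, better, install a higher-level library like requests, which does that for you automatically.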
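One way to work the encoding out dynamically with only the standard library is sketched below. The helper names pick_charset and load_words and the UTF-8 fallback are assumptions, not part of the original answer. The sketch relies on the response's headers object being an email.message.Message subclass, whose get_content_charset() method reads the charset parameter of the Content-Type header.

```python
from email.message import Message
from urllib.request import urlopen


def pick_charset(headers, fallback="utf-8"):
    """Return the charset declared in Content-Type, or the fallback (hypothetical helper)."""
    # get_content_charset() returns the lowercased charset parameter,
    # or the given default when the header does not declare one.
    return headers.get_content_charset(fallback)


def load_words(url, fallback="utf-8"):
    """Download a word list and return it as a list of str (hypothetical helper)."""
    with urlopen(url) as response:
        charset = pick_charset(response.headers, fallback)
        return [line.strip().decode(charset) for line in response.readlines()]


# Offline demonstration with hand-built headers, so no network is needed.
headers = Message()
headers["Content-Type"] = "text/plain; charset=ISO-8859-1"
declared = pick_charset(headers)
undeclared = pick_charset(Message())
print(declared, undeclared)
```

In the script, WORDS = load_words(WORD_URL) would then replace the loading loop. Falling back to UTF-8 is a guess that suits many pages but not all; a server can also declare a charset that does not match the actual content.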
Answer 2 (score: 0)
Explicitly convert the bytes object 'word' to a string, passing the encoding:
result = result.replace("###", str(word, "utf-8"), 1)
That should work.
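One caveat when converting with str(): in Python 3, calling str() on a bytes object without an encoding does not decode it but returns its printable representation, including the b'' wrapper. A short sketch with a made-up sample word:

```python
word = b"apple"  # sample value, standing in for one downloaded line

as_repr = str(word)              # no encoding given: the repr of the bytes object
as_text = str(word, "utf-8")     # encoding given: a real decode
as_decoded = word.decode("utf-8")  # equivalent and more common spelling

print(as_repr)
print(as_text)
```

With the plain str(word) form the quiz would run without a TypeError but print names such as b'apple', so passing the encoding (or calling decode) is the safer choice.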