original_tweet = 'I luv my <3 iphone & you’re awsm apple. DisplayIsAwesome, sooo happppppy http://www.apple.com'
import HTMLParser
html_parser = HTMLParser.HTMLParser()
tweet = html_parser.unescape(original_tweet)
UnicodeDecodeError                        Traceback (most recent call last)
<ipython-input-12-58919c61b71f> in <module>()
----> 1 tweet = html_parser.unescape(original_tweet)
      2 tweet

C:\Users\vntja\Anaconda2\ds\lib\HTMLParser.pyc in unescape(self, s)
    474                 return '&'+s+';'
    475
--> 476         return re.sub(r"&(#?[xX]?(?:[0-9a-fA-F]+|\w{1,8}));", replaceEntities, s)

C:\Users\vntja\Anaconda2\ds\lib\re.pyc in sub(pattern, repl, string, count, flags)
    153     a callable, it's passed the match object and must return
    154     a replacement string to be used."""
--> 155     return _compile(pattern, flags).sub(repl, string, count)
    156
    157 def subn(pattern, repl, string, count=0, flags=0):

UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 4: ordinal not in range(128)
Answer 0 (score: 0)
Add this line at the top of your Python script:
# -*- coding: utf-8 -*-
You are trying to decode as ASCII a byte (0xe2) that ASCII does not define.
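Note that the coding declaration only tells Python how to read the source file; in Python 2 a plain '...' literal is still a byte string. A minimal sketch of one possible approach, decoding the bytes to Unicode before unescaping, is below. The shortened sample string and its `&lt;` / `&amp;` entities are assumptions for illustration, not the asker's exact data, and the `try`/`except` import is only there so the sketch also runs on Python 3.

```python
# -*- coding: utf-8 -*-
# Sketch: decode the byte string to Unicode first, then unescape HTML entities.

try:
    from html import unescape  # Python 3.4+
except ImportError:
    from HTMLParser import HTMLParser  # Python 2, as in the question
    unescape = HTMLParser().unescape

# Hypothetical sample: UTF-8 bytes containing a curly apostrophe (E2 80 99)
# and two HTML entities. Not the asker's exact tweet.
raw_bytes = b"I luv my &lt;3 iphone &amp; you\xe2\x80\x99re awsm apple."

# Explicit decoding avoids the implicit ASCII decode that raised the error.
text = raw_bytes.decode("utf-8")

tweet = unescape(text)
print(tweet)
```

In Python 2, writing the literal as `u'...'` (together with the coding declaration) should have a similar effect, since the string is then Unicode from the start.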