Question

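I am using BeautifulSoup to scrape data. The text I want to scrape is "€ 48,50", which contains a non-ASCII character. However, I would like to replace the euro sign with nothing, so that the final output is "48,50". I keep getting errors because the console cannot print it. I am using Python 2.7 on Windows. I would appreciate a solution.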

I basically keep running into errors and don't know how to go about this. Or is there a way to extract the non-ASCII characters on their own?

w= item.find_all("div",{"class":"product-price"}).find("strong",
{"class":"product-price__money"}).text.replace("\\u20ac"," ")
print w

Answer 1

You need to decode the string and pass a unicode string to the replace function.

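# -*- coding: utf-8 -*-
# Python 2.7: the literal below is a UTF-8 byte string, so decode it first.
text = "€ 48,50"
w = text.decode("utf-8").replace(u"\u20ac", " ")
print w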

For details, see "How to replace unicode characters in string with something else python?".
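
As a rough sketch of the same idea applied to scraped text: BeautifulSoup's .text is already a unicode string, so only byte strings need decoding before the replace. The helper names strip_euro and split_ascii below are hypothetical, made up for illustration, and the UTF-8 default is an assumption about the input. The code is written to run on both Python 2.7 and Python 3.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function, unicode_literals


def strip_euro(text, encoding="utf-8"):
    """Return text without the euro sign and without surrounding whitespace.

    `strip_euro` is a hypothetical helper; `encoding` is an assumed default.
    """
    # Byte strings (Python 2 `str`, Python 3 `bytes`) are decoded first.
    # Unicode input, such as BeautifulSoup's .text, passes through unchanged.
    if isinstance(text, bytes):
        text = text.decode(encoding)
    return text.replace("\u20ac", "").strip()


def split_ascii(text):
    """Split unicode text into (ascii_part, non_ascii_part)."""
    ascii_part = "".join(ch for ch in text if ord(ch) < 128)
    non_ascii_part = "".join(ch for ch in text if ord(ch) >= 128)
    return ascii_part, non_ascii_part


price = strip_euro("\u20ac 48,50")
print(price)  # 48,50
```

Because the cleaned value is plain ASCII, printing it should not trip over the Windows console encoding. If you want the non-ASCII characters on their own, as asked in the question, split_ascii returns them as the second element of the tuple.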
