Eliminating \xa0 returns UnicodeEncodeError

Date: 2018-03-06 08:29:24

Tags: python xpath unicode

I'm using XPath to grab data, and the output contains '\xa0', which is Unicode. I want to eliminate it, but it returns:

UnicodeEncodeError: 'ascii' codec can't encode character u'\xa0' in position 0: ordinal not in range(128)

Here is my code:

import requests
from lxml import html

page_active = requests.get('http://www.marketinout.com/stock-screener/stocks.php?list=volume_leaders&exch=asx')
active = html.fromstring(page_active.content)
data = active.xpath('//tbody/tr/td/text()')

>>> data
[u'\xa0', u'\xa0', u'\xa0Bard1 Life Sciences Limited ', u'\xa0Gold', u'\xa0Basic Materials', u'\xa0ASX', u'\xa07', u'\xa00.025', u'\xa00.015', u'\xa0150.0', u'\xa02 78,097,367', u'\xa0', u'\xa0', u'\xa0Patrys Ltd ...]

To eliminate '\xa0', I tried:

[a.replace('\xa0',' ') for a in data]

but it returns:

UnicodeEncodeError: 'ascii' codec can't encode character u'\xa0' in position 0: ordinal not in range(128)

I also tried another variant, but I still get the same error.

2 Answers:

Answer 0: (score: 1)

You need to tell Python to interpret your string literals as Unicode.

To do that, prefix each string literal with u:

[a.replace(u'\xa0', u' ') for a in data]

Answer 1: (score: 0)

You are mixing bytes and Unicode; don't do that. Use Unicode string literals instead:

[a.replace(u'\xa0', u' ') for a in data]

Otherwise, Python will try to decode the byte string '\xa0' as ASCII, and 0xA0 is not a valid ASCII code point.
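To make the point above concrete, here is a minimal sketch that uses a few of the values from the question's output as hard-coded sample data, so it does not need the network. The names `data` and `cleaned` are only illustrative. The u prefix matters on Python 2; on Python 3 it is accepted but redundant, since all string literals are already Unicode.

```python
# Sample values copied from the question's output (assumption: a small
# subset is enough to illustrate the behaviour).
data = [u'\xa0', u'\xa0Gold', u'\xa0Basic Materials', u'\xa00.025']

# Both the pattern and the replacement are Unicode literals, so Python
# never has to convert a byte string through the ASCII codec.
cleaned = [a.replace(u'\xa0', u' ') for a in data]
print(cleaned)

# U+00A0 (no-break space) is code point 160, outside the ASCII range 0-127,
# which is why the ASCII codec cannot handle it.
print(ord(u'\xa0'))  # prints 160
```

Note that this turns each no-break space into an ordinary space, so entries that contained only '\xa0' become a single space rather than an empty string.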

Alternatively, use unicode.strip() to remove leading and trailing whitespace; the U+00A0 code point counts as whitespace:

[a.strip() for a in data]
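As a rough illustration of how strip() behaves on this kind of data, the sketch below again uses hard-coded sample values from the question. The `non_empty` filter and the `inner` example are my own additions, not part of the answer; they show that cells holding only '\xa0' become empty strings, and that strip() does not touch a no-break space in the middle of a string.

```python
# Sample values copied from the question's output.
data = [u'\xa0', u'\xa0Bard1 Life Sciences Limited ', u'\xa0Gold', u'\xa07']

# strip() removes leading and trailing whitespace, including U+00A0.
stripped = [a.strip() for a in data]

# Optional extra step (assumption: empty cells are not wanted).
non_empty = [s for s in stripped if s]

# A no-break space inside the text is left alone by strip().
inner = u'Basic\xa0Materials'.strip()

print(stripped)
print(non_empty)
```

If interior no-break spaces also need to go, combining strip() with the replace() approach from the answers would cover both cases.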