How to fix "Unicode strings with encoding declaration are not supported"

Asked: 2019-09-07 11:04:11

Tags: python parsing lxml

ValueError: Unicode strings with encoding declaration are not supported. 
Please use bytes input or XML fragments without declaration.

Parsing this website fails with the error above.

When I try to serialize the page text, I get the error: TypeError: Type 'str' cannot be serialized

import requests
from lxml import html

source = 'http://games.chruker.dk/eve_online/item.php?type_id=814'
path = '//*[@id="top"]/table[1]/tbody/tr[1]/td[3]/table'

page = requests.get(source)
pagetext = page.text

parser = html.fromstring(pagetext)

result = parser.xpath(path)
print(result)


I expect to get the table shown on the site: http://games.chruker.dk/eve_online/item.php?type_id=814

2 answers:

Answer 0: (score: 4)

Pass bytes instead of a str, so that lxml can read the encoding declaration itself:

parser = html.fromstring(bytes(pagetext, encoding='utf8'))
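The error comes from handing lxml a str that still starts with an XML declaration naming an encoding. Below is a minimal sketch of the difference, assuming a made-up inline document (`doc_text` is a hypothetical stand-in for `page.text`) so that it runs without network access. Whether the str call actually raises may depend on the installed lxml version, so the sketch only records it.

```python
from lxml import html

# Hypothetical stand-in for page.text: a str that begins with an XML
# declaration naming an encoding, as XHTML pages often do.
doc_text = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<html><body><table id="t"><tr><td>cell</td></tr></table></body></html>'
)

# Passing the str directly is what triggers the ValueError from the question
# (recorded here rather than assumed, since behavior may vary by version).
try:
    html.fromstring(doc_text)
    str_input_failed = False
except ValueError as exc:
    str_input_failed = True
    print("str input rejected:", exc)

# Encoding to bytes first lets the parser handle the declaration itself.
root = html.fromstring(doc_text.encode("utf-8"))
cells = root.xpath('//table[@id="t"]//td/text()')
print(cells)
```

With requests, `page.content` already holds the raw response bytes, so `html.fromstring(page.content)` should work as well and avoids re-encoding text that may not have been UTF-8 originally.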

Answer 1: (score: 1)

The parse function in the API accepts a URL directly, so you can pass your source variable as-is:

from lxml import html

source = 'http://games.chruker.dk/eve_online/item.php?type_id=814'
path = '//*[@id="top"]/table[1]/tbody/tr[1]/td[3]/table'

tree = html.parse(source)

result = tree.xpath(path)

print(result)
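Besides URLs, `html.parse` also accepts file names and file-like objects. The sketch below uses an in-memory `io.BytesIO` with made-up markup (`data` is hypothetical, not the real page) so it runs offline. It also shows `html.tostring`, which expects an element rather than a str; passing a str is what leads to "Type 'str' cannot be serialized".

```python
import io

from lxml import html

# Hypothetical markup standing in for the downloaded page, as bytes.
data = (
    b'<?xml version="1.0" encoding="UTF-8"?>\n'
    b'<html><body><table id="t"><tr><td>cell</td></tr></table></body></html>'
)

# html.parse returns an ElementTree; xpath can be called on it directly.
tree = html.parse(io.BytesIO(data))
result = tree.xpath('//table[@id="t"]')
tags = [element.tag for element in result]

# Serialize the matched element (not a str) to see its markup.
markup = html.tostring(result[0], encoding="unicode")
print(tags)
print(markup)
```

Note that when given a URL, `html.parse` relies on libxml2's built-in HTTP support, which does not handle HTTPS; for such pages, fetching with requests and parsing `page.content` is the more dependable route.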