Question

I am trying to use ElementTree to extract the value in <paste_key>, and I get the error below. Can anyone help me see what I am doing wrong?

from pastebin import PastebinAPI
from xml.etree import cElementTree as ET
import time

x = self.apiobject.pastes_by_user(api_dev_key=self.DEVKEY, api_user_key=self.userkey)
print x
x = ET.fromstring(x)


for key in list(x):
  self.pastekeys.append(key.find('paste_key').text)
print self.pastekeys

Error output: junk after document element: line 13, column 0

Sample data present in x:

<paste>
<paste_key>afafafaf</paste_key>
<paste_date>1508796842</paste_date>
<paste_title>1508796842</paste_title>
<paste_size>36096</paste_size>
<paste_expire_date>0</paste_expire_date>
<paste_private>2</paste_private>
<paste_format_long>None</paste_format_long>
<paste_format_short>text</paste_format_short>
<paste_url>https://pastebin.com/afafafaf</paste_url>
<paste_hits>0</paste_hits>
</paste>
<paste>
<paste_key>asdfasdf</paste_key>
<paste_date>1508796842</paste_date>
<paste_title>1508796842</paste_title>
<paste_size>36096</paste_size>
<paste_expire_date>0</paste_expire_date>
<paste_private>2</paste_private>
<paste_format_long>None</paste_format_long>
<paste_format_short>text</paste_format_short>
<paste_url>https://pastebin.com/asdfasdf</paste_url>
<paste_hits>0</paste_hits>
</paste>
...

Answer 1

If the problem is the XML structure, try BeautifulSoup.

If your pastes are in a string named pastebin_string, it would look like this:

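from bs4 import BeautifulSoup

# html.parser is lenient, so it tolerates several top-level <paste> elements
soup = BeautifulSoup(pastebin_string, "html.parser")
pastes = soup.find_all("paste")
for paste in pastes:
    key = paste.find("paste_key")
    print(key.text)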

Answer 2

The following worked for me. Thanks to @john-gordon for pointing it out.

        x = self.apiobject.pastes_by_user(api_dev_key=self.DEVKEY, api_user_key=self.userkey)

        x = x.split("</paste>")
        x = [y + "</paste>\r\n" for y in x]

        for key in x[:-1]:
            paste = ET.fromstring(key)
            self.pastekeys.append(paste.find('paste_key').text)
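For context on why splitting helps: an XML document may have only one top-level element, and the data here contains several sibling <paste> elements, so the parser reports everything after the first </paste> as "junk after document element". An alternative to splitting is to wrap the whole string in a single synthetic root before parsing. The sketch below assumes the response is a text string of well-formed <paste> elements with no XML declaration; the function name extract_paste_keys and the wrapper tag <pastes> are made up for illustration and are not part of the Pastebin API.

```python
import xml.etree.ElementTree as ET


def extract_paste_keys(raw_xml):
    """Return the text of every <paste_key> in a string of sibling <paste> elements."""
    # Wrap the fragments in one root element so the parser sees a single
    # document element. "pastes" is an arbitrary name chosen for this sketch.
    root = ET.fromstring("<pastes>" + raw_xml + "</pastes>")
    # findtext returns None if a <paste> has no <paste_key> child.
    return [paste.findtext("paste_key") for paste in root.iter("paste")]


# Shortened version of the sample data from the question.
sample = """<paste>
<paste_key>afafafaf</paste_key>
<paste_hits>0</paste_hits>
</paste>
<paste>
<paste_key>asdfasdf</paste_key>
<paste_hits>0</paste_hits>
</paste>"""

print(extract_paste_keys(sample))  # ['afafafaf', 'asdfasdf']
```

Compared with splitting on "</paste>", this parses the data once and does not depend on the exact closing-tag text or on a trailing empty piece.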

Python ElementTree: extracting a value does not work
