lxml does not coerce an int to a string before parsing

Time: 2018-02-01 16:18:21

Tags: python xml parsing

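I am trying to use Python to extract a Qualys vulnerability report ID from an API call. Essentially, the report ID is an int, and lxml can only parse strings. I have used the same code for this in the past and it worked fine. I assumed lxml was smart enough to coerce an int to a string before parsing. Is there a way to do this manually so that the parse error goes away? My code, the output, and the traceback are below.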

Code:

import requests
import time
import lxml
from lxml import etree

s = requests.Session()
s.headers.update({'X-Requested-With':'X'})

def login(s):
    payload = {'action':'login', 'username':'X', 'password':'X'}

    r = s.post('https://qualysapi.qualys.com/api/2.0/fo/session/', 
    data=payload)

def launchReport(s, polling_delay=250):
    payload = {'action':'launch', 'template_id':'X', 
    'output_format':'xml', 'report_title':'X'}

    r = s.post('https://qualysapi.qualys.com/api/2.0/fo/report/', 
    data=payload)
    print r.text
    extract_id = etree.XML(r).find('.//VALUE')
    print("Report ID = %s" % extract_id)
    time.sleep(polling_delay)
    return extract_id

login(s)
launchReport(s)

Output:

<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE SIMPLE_RETURN SYSTEM 
"https://qualysapi.qualys.com/api/2.0/simple_return.dtd">
<SIMPLE_RETURN>
  <RESPONSE>
    <DATETIME>2018-02-01T16:00:14Z</DATETIME>
    <TEXT>New report launched</TEXT>
    <ITEM_LIST>
      <ITEM>
        <KEY>ID</KEY>
        <VALUE>16441920</VALUE>
      </ITEM>
    </ITEM_LIST>
  </RESPONSE>
</SIMPLE_RETURN>

Traceback:

Traceback (most recent call last):
  File "test.py", line 30, in <module>
    launchReport(s)
  File "test.py", line 22, in launchReport
    extract_id = etree.XML(r).find('.//VALUE')
  File "src/lxml/etree.pyx", line 3209, in lxml.etree.XML 
(src/lxml/etree.c:80823)
  File "src/lxml/parser.pxi", line 1870, in 
lxml.etree._parseMemoryDocument (src/lxml/etree.c:121231)
ValueError: can only parse strings

1 answer:

Answer 0 (score: 1)

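You are trying to parse the response object rather than the data in the response. The error has nothing to do with the ID being an int: r is a requests Response object, and etree.XML() only accepts the document text itself. Change etree.XML(r) to etree.XML(r.content). The bytes in r.content are the safer choice over the decoded r.text here, because the document carries an encoding declaration, which lxml rejects in a Unicode string. Note also that find() returns an element rather than its text, so read the ID from the element's .text attribute or use findtext().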
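
As a rough illustration of the fix, the sketch below parses the response body shown in the question's output and pulls out the report ID. The helper name extract_report_id and the SAMPLE_BODY constant are hypothetical and only serve to make the example self-contained. In the real script the bytes would come from r.content. The fallback to the standard library's ElementTree is an assumption added so the sketch also runs where lxml is not installed.

```python
try:
    from lxml import etree  # the library used in the question
except ImportError:
    # Assumption for portability: the stdlib parser offers the same
    # XML() and findtext() calls used below.
    import xml.etree.ElementTree as etree

# Response body copied from the question's output. In the real script
# this would be r.content (bytes), not the Response object r.
SAMPLE_BODY = b"""<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE SIMPLE_RETURN SYSTEM "https://qualysapi.qualys.com/api/2.0/simple_return.dtd">
<SIMPLE_RETURN>
  <RESPONSE>
    <DATETIME>2018-02-01T16:00:14Z</DATETIME>
    <TEXT>New report launched</TEXT>
    <ITEM_LIST>
      <ITEM>
        <KEY>ID</KEY>
        <VALUE>16441920</VALUE>
      </ITEM>
    </ITEM_LIST>
  </RESPONSE>
</SIMPLE_RETURN>"""


def extract_report_id(body):
    """Hypothetical helper: return the first VALUE element's text as an int."""
    root = etree.XML(body)                # parse the bytes, not the Response
    value = root.findtext('.//VALUE')     # text of the element, or None
    if value is None:
        raise ValueError("no VALUE element found in the response")
    return int(value)


report_id = extract_report_id(SAMPLE_BODY)
print("Report ID = %s" % report_id)
```

In launchReport this would amount to calling extract_report_id(r.content) in place of etree.XML(r).find('.//VALUE').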