IOError when passing Request.content to lxml.etree.parse()

Time: 2015-08-17 19:51:03

Tags: python lxml

I have the following XML on a web page:

<entry>
    <id>1750</id>
    <title>variablename</title>
    <source>
      com.tidalsoft.webclient.tes.dsp.db.datatypes.Variable
    </source>
    <tes:variable>
        <tes:ownername>ownergroup</tes:ownername>
        <tes:productiondate>2015-08-17T00:00:00-0400</tes:productiondate>
        <tes:readonly>N</tes:readonly>
        <tes:publish>N</tes:publish>
        <tes:description>
          Decription Here
        </tes:description>
        <tes:startcalendar>0</tes:startcalendar>
        <tes:ownerid>666</tes:ownerid>
        <tes:type>1</tes:type>
        <tes:lastusermodifiedtime>2015-06-15T15:42:27-0400</tes:lastusermodifiedtime>
        <tes:innervalue>\\share\location</tes:innervalue>
        <tes:calc>N</tes:calc>
        <tes:name>variablename</tes:name>
        <tes:startdate>1899-12-30T00:00:00-0500</tes:startdate>
        <tes:pub>Y</tes:pub>
        <tes:lastvalue>\\share\location</tes:lastvalue>
        <tes:id>1750</tes:id>
        <tes:startdateasstring>18991230000000</tes:startdateasstring>
        <tes:lastchangetime>2015-06-15T15:42:27-0400</tes:lastchangetime>
        <tes:clientcachelastchangetime>2015-08-17T09:56:49-0400</tes:clientcachelastchangetime>
    </tes:variable>
</entry>

I am trying to parse this data. I fetched it with requests:

r = requests.get(url, auth=('username', 'password'))

But when I try to parse the content, I get an error:

>>> xmlObject = etree.parse(r.content)
Traceback (most recent call last):
  File "apiTest.py", line 46, in <module>
    xmlObject = etree.parse(r.content)
  File "lxml.etree.pyx", line 3310, in lxml.etree.parse (src\lxml\lxml.etree.c:72517)
  File "parser.pxi", line 1791, in lxml.etree._parseDocument (src\lxml\lxml.etree.c:105979)
  File "parser.pxi", line 1817, in lxml.etree._parseDocumentFromURL (src\lxml\lxml.etree.c:106278)
  File "parser.pxi", line 1721, in lxml.etree._parseDocFromFile (src\lxml\lxml.etree.c:105277)
  File "parser.pxi", line 1122, in lxml.etree._BaseParser._parseDocFromFile (src\lxml\lxml.etree.c:100227)
  File "parser.pxi", line 580, in lxml.etree._ParserContext._handleParseResultDoc (src\lxml\lxml.etree.c:94350)
  File "parser.pxi", line 690, in lxml.etree._handleParseResult (src\lxml\lxml.etree.c:95786)
  File "parser.pxi", line 618, in lxml.etree._raiseParseError (src\lxml\lxml.etree.c:94818)
IOError: Error reading file ''

In the final line, the content between the quotes is the XML string shown at the beginning.

The data is served with content-type: text/xml

1 answer:

Answer 0: (score: 4)

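etree.parse expects a filename, a file-like object, or a URL as its first argument (see help(etree.parse)). It does not expect an XML string, so it treats r.content as a file name and fails to read it. To parse an XML string, use: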

xmlObject = etree.fromstring(r.content)
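As a minimal sketch of the difference, the snippet below parses the same bytes both ways. The `content` value is a shortened, hypothetical stand-in for `r.content`, and the fallback import is only there so the sketch also runs where lxml is not installed (the standard library's `xml.etree.ElementTree` offers the same two functions for this purpose).

```python
import io

try:
    from lxml import etree
except ImportError:
    # Assumption: fall back to the standard library if lxml is unavailable.
    import xml.etree.ElementTree as etree

# Hypothetical stand-in for r.content (requests returns the body as bytes).
content = b"<entry><id>1750</id><title>variablename</title></entry>"

# fromstring takes the XML data itself and returns the root element.
root = etree.fromstring(content)

# parse takes a filename, URL, or file-like object, so wrap the bytes first.
tree = etree.parse(io.BytesIO(content))

print(root.tag, root.findtext("title"))
print(tree.getroot().findtext("id"))
```

Wrapping the bytes in `io.BytesIO` is the option to reach for when an element tree object, rather than just the root element, is wanted directly from `parse`.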

Note that etree.fromstring returns an lxml.etree._Element. In contrast, etree.parse returns an lxml.etree._ElementTree. Given the _Element, you can obtain the _ElementTree with the getroottree method:

xmlTree = xmlObject.getroottree()
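The relationship between the two return types can be checked with a short sketch like the following. It needs lxml itself, since getroottree is an lxml method, and the `content` value is again a hypothetical stand-in for `r.content`.

```python
from lxml import etree

# Hypothetical stand-in for r.content.
content = b"<entry><id>1750</id></entry>"

root = etree.fromstring(content)   # root element of the document
tree = root.getroottree()          # tree object that wraps that element

print(type(root).__name__)
print(type(tree).__name__)

# Going back from the tree to its root yields the <entry> element again.
print(tree.getroot().tag)
```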