Python lxml save not working

Time: 2014-08-29 22:55:13

Tags: python xml lxml

I have the following script:

count = 1
for line in temp:
    if (str(count) + '=') in line:
        job = re.findall(re.escape('=')+"(.*)",line)[0]

        fullsrcurl = self.srcjson + '?format=xml&jobname=' + job
        srcfile = urllib2.urlopen(fullsrcurl)
        srcdoc = etree.parse(srcfile)
        srcdata = etree.tostring(srcdoc, pretty_print=True)
        srcjobmst_id = srcdoc.xpath('//jobmst_id/text()')[0]
        srcxml = 'c:\\temp\\deployments\\%s\\%s.xml' % (source_env, srcjobmst_id)
        srcxmlsave = open(srcxml, 'w')
        srcxmlsave.write(srcdata)
        srcxmlsave.close

        fulldsturl = self.targetjson + '?format=xml&jobname=' + job
        dstfile = urllib2.urlopen(fulldsturl)
        dstdoc = etree.parse(dstfile)
        dstdata = etree.tostring(dstdoc, pretty_print=True)
        dstjobmst_id = dstdoc.xpath('//jobmst_id/text()')[0]
        dstxml = 'c:\\temp\\deployments\\%s\\%s.xml' % (target_env, dstjobmst_id)
        dstxmlsave = open(dstxml, 'w')
        dstxmlsave.write(dstdata)
        dstxmlsave.close

        print "Job = " + job
        count += 1

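It hits two separate APIs in two environments, but the data is almost identical. The source works fine, but as soon as it tries to do anything with the target data I get the following error: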

Traceback (most recent call last):
  File "S:\Operations\Tidal\deployment\deployv2.py", line 213, in <module>
    main()
  File "S:\Operations\Tidal\deployment\deployv2.py", line 209, in main
    auto_deploy.deploy()
  File "S:\Operations\Tidal\deployment\deployv2.py", line 173, in deploy
    dstdoc = etree.parse(dstfile)
  File "lxml.etree.pyx", line 3239, in lxml.etree.parse (src\lxml\lxml.etree.c:69970)
  File "parser.pxi", line 1770, in lxml.etree._parseDocument (src\lxml\lxml.etree.c:102272)
  File "parser.pxi", line 1790, in lxml.etree._parseFilelikeDocument (src\lxml\lxml.etree.c:102531)
  File "parser.pxi", line 1685, in lxml.etree._parseDocFromFilelike (src\lxml\lxml.etree.c:101457)
  File "parser.pxi", line 1134, in lxml.etree._BaseParser._parseDocFromFilelike (src\lxml\lxml.etree.c:97084)
  File "parser.pxi", line 582, in lxml.etree._ParserContext._handleParseResultDoc (src\lxml\lxml.etree.c:91290)
  File "parser.pxi", line 683, in lxml.etree._handleParseResult (src\lxml\lxml.etree.c:92476)
  File "parser.pxi", line 622, in lxml.etree._raiseParseError (src\lxml\lxml.etree.c:91772)
lxml.etree.XMLSyntaxError: Extra content at the end of the document, line 4, column 1

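So something must be different about the target XML, but I'm having a hard time figuring out what. When I view both responses in the browser they are identical, apart from a few values (jobmst_id).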

2 Answers:

Answer 0: (score: 1)

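You are not closing the files. Without parentheses, srcxmlsave.close only references the method and never calls it. Change srcxmlsave.close to srcxmlsave.close() (and likewise dstxmlsave.close), or use a context manager, like: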

with open(srcxml, 'w') as srcxmlsave:
    srcxmlsave.write(srcdata)
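
The difference between referencing close and calling it can be checked directly. The following is a minimal sketch, not part of the original script: the helper name save_text, the temporary file name, and the sample XML payload are all made up for illustration, and it is written so that it also runs on Python 3.

```python
import os
import tempfile


def save_text(path, data):
    """Write data to path; the with-block closes the file even if write() raises."""
    with open(path, 'w') as handle:
        handle.write(data)
    # Once the with-block has exited, the file object reports itself as closed.
    return handle.closed


# Hypothetical file name and payload, for illustration only.
demo_path = os.path.join(tempfile.mkdtemp(), 'demo.xml')
was_closed = save_text(demo_path, '<job><jobmst_id>1</jobmst_id></job>\n')

# Without parentheses, `close` is just an attribute lookup; nothing is called.
leaky = open(demo_path)
leaky.close                    # evaluates to a bound method and discards it
still_open = not leaky.closed  # the file is still open at this point
leaky.close()                  # the actual call
```

With the context manager there is no close call to forget, which is why it is usually the safer of the two fixes.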

Answer 1: (score: 0)

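In case anyone runs into a problem like this in the future: I found the issue, and it had nothing to do with lxml or the XML generation. My source environment had already been moved to production under mod_wsgi, but the target environment was still using runserver.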

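My guess is that something in the encoding was breaking the target. I moved the target environment to production as well, and it now works fine.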
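
For anyone debugging a similar "Extra content at the end of the document" error: it means the parser found something after the root element had already closed, so it helps to look at the raw response text around the reported line rather than at the browser's rendering of it. The sketch below is one possible way to do that, under some loud assumptions: it uses the standard library's xml.etree.ElementTree as a stand-in so it runs without lxml installed (with lxml the analogous calls would be etree.fromstring and etree.XMLSyntaxError), the helper name parse_or_report is made up, and the sample body is invented, not the actual response from the target server.

```python
import xml.etree.ElementTree as ET


def parse_or_report(raw_text):
    """Return (root, None) on success, or (None, report) with context on failure."""
    try:
        return ET.fromstring(raw_text), None
    except ET.ParseError as err:
        line_no, column = err.position  # line is 1-based, column is 0-based
        lines = raw_text.splitlines()
        # Keep the offending line and the line just before it.
        start = max(line_no - 2, 0)
        report = {
            'message': str(err),
            'line': line_no,
            'column': column,
            'context': lines[start:line_no],
        }
        return None, report


# Hypothetical response body: a complete document followed by stray output.
sample = '<job>\n  <jobmst_id>42</jobmst_id>\n</job>\nunexpected trailing text\n'
root, report = parse_or_report(sample)
if report is not None:
    print(report['message'])
    for text in report['context']:
        print(repr(text))
```

In the script from the question, this would mean reading the urlopen response into a string first and parsing that string, so the same text can be printed or saved when parsing fails.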