I have a simple XML file:
<?xml version="1.0" encoding="UTF-8" ?>
<root>
    <child>abc</child>
</root>
I want to parse it from a file, and this works fine:
import lxml.etree

with open('tst.xml', 'rb') as test_xml:  # binary mode: lxml decodes using the XML declaration
    for _, element in lxml.etree.iterparse(test_xml, tag='child'):
        print(element.text)  # prints abc as expected
However, when I tried to modify the script so that it could parse the XML from either a file or stdin, it failed:
import fileinput
import lxml.etree

fi = fileinput.input('tst.xml')
for _, element in lxml.etree.iterparse(fi, tag='child'):
    print(element.text)
# File "iterparse.pxi", line 371, in lxml.etree.iterparse.__init__ (src/lxml/lxml.etree.c:97283)
# File "apihelpers.pxi", line 1411, in lxml.etree._encodeFilename (src/lxml/lxml.etree.c:22515)
# TypeError: Argument must be string or unicode.
I'm not sure what I'm doing wrong. Isn't a FileInput object a file-like object in Python?
Answer 0 (score: 1)
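Without digging any deeper, the exception seems to be caused by the FileInput class not providing a read method. lxml then apparently treats the object as a filename, which would explain why the TypeError is raised inside _encodeFilename.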
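A quick way to check this claim without lxml is to inspect the class itself. This is only a sketch of the check, not a statement about how lxml decides what counts as file-like; the variable names are made up for illustration.

```python
import fileinput

# FileInput is designed for line-by-line iteration over several inputs,
# so it offers readline() and the iterator protocol, but no read().
has_read = hasattr(fileinput.FileInput, "read")
has_readline = hasattr(fileinput.FileInput, "readline")

print("read:", has_read, "readline:", has_readline)
```

If read is reported as missing, any consumer that pulls data in chunks through read() cannot use a FileInput object directly.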
To get what I wanted, I ended up writing my own wrapper for now:
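import sys

class FileInput(object):
    """Context manager yielding an open file, or sys.stdin for no name or "-"."""
    def __init__(self, filename=None, *args, **kwargs):
        self.file = open(filename, *args, **kwargs) if filename and filename != "-" else sys.stdin
    def __enter__(self):
        return self.file
    def __exit__(self, type, value, traceback):
        # Never close stdin; only close a file this wrapper opened.
        if self.file is not sys.stdin:
            self.file.close()
    def __getattr__(self, name):
        # Delegate everything else (read, readline, ...) to the wrapped file.
        return getattr(self.file, name)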
I'll wait for a better answer.
Answer 1 (score: 0)
You shouldn't try to use the fileinput module here; just do it directly:
import sys

if filename == '-':  # or if we don't have a filename argument
    f = sys.stdin
else:
    f = open(filename, 'r')
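For completeness, here is a rough sketch of how that file-or-stdin choice could be wired into the parsing loop from the question. The helper names open_source and child_texts are hypothetical, not part of lxml or the standard library. The sketch assumes Python 3, where sys.stdin.buffer gives the raw byte stream, and it falls back to the standard library's xml.etree.ElementTree when lxml is not installed. Because the stdlib iterparse has no tag= keyword, the tag is filtered by hand.

```python
import contextlib
import sys

try:
    from lxml import etree  # the library used in the question
except ImportError:
    import xml.etree.ElementTree as etree  # stdlib stand-in with a similar iterparse


@contextlib.contextmanager
def open_source(filename=None):
    """Hypothetical helper: yield a binary stream for a path, or stdin for None or "-"."""
    if filename is None or filename == "-":
        # On Python 3, sys.stdin.buffer is the underlying byte stream;
        # fall back to sys.stdin itself where no .buffer attribute exists.
        yield getattr(sys.stdin, "buffer", sys.stdin)
    else:
        with open(filename, "rb") as f:
            yield f


def child_texts(stream, tag="child"):
    """Hypothetical helper: collect the text of every matching element in the stream."""
    # Filtering on el.tag (instead of lxml's tag= keyword) keeps this
    # working with the stdlib fallback as well.
    return [el.text for _, el in etree.iterparse(stream) if el.tag == tag]
```

Usage would look like `with open_source(name) as source: print(child_texts(source))`, where name is a path such as 'tst.xml' or "-" for stdin. stdin is deliberately left open on exit, the same choice the wrapper in the other answer makes.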