I have a file named core-site.xml:
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>file:/home/centos/hadoop_tmp/tmp</value>
</property>
<property>
<name>fs.defaultFS</name>
<value>hdfs://test:9000</value>
</property>
</configuration>
How can I get a dictionary like this in Python?
{'hadoop.tmp.dir': 'file:/home/centos/hadoop_tmp/tmp', 'fs.defaultFS': 'hdfs://test:9000'}
Answer 0 (score: 2)
You should use Python's ElementTree library, which is documented here: https://docs.python.org/2/library/xml.etree.elementtree.html
First, pass the .xml file to the ElementTree library:
import xml.etree.ElementTree as ET
tree = ET.parse('core-site.xml')
root = tree.getroot()
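If the XML is already in memory as a string (in a unit test, for example), a minimal sketch using `ET.fromstring` yields the same kind of root element without touching the filesystem. The `xml_text` variable is an illustrative assumption, copied from the file in the question.

```python
import xml.etree.ElementTree as ET

# Assumed in-memory copy of core-site.xml, taken from the question.
xml_text = """<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>file:/home/centos/hadoop_tmp/tmp</value>
</property>
<property>
<name>fs.defaultFS</name>
<value>hdfs://test:9000</value>
</property>
</configuration>"""

# fromstring returns the root element directly, so no getroot() call is needed.
root = ET.fromstring(xml_text)
print(root.tag)
print(len(root.findall('property')))
```

Either way, `root` refers to the `<configuration>` element, so the loops that follow work unchanged.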
Once that is done, you can use the root object to walk the XML document:
for entry in root.findall('property'):
    ...  # process each <property> element here
Inside this loop, you can extract the name and value from each property:
for entry in root.findall('property'):
    name = entry.find('name').text
    value = entry.find('value').text
    print(name)
    print(value)
You want these in a dictionary, which is straightforward:
configuration = dict()
for entry in root.findall('property'):
    name = entry.find('name').text
    value = entry.find('value').text
    configuration[name] = value
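One caveat with the loop above: `entry.find('value')` returns `None` when a `<property>` has no `<value>` child, and accessing `.text` then raises `AttributeError`. Below is a minimal sketch that tolerates this, assuming you want to skip unnamed properties and store `None` for missing values. The helper name `properties_to_dict` and the `no.value` sample property are hypothetical.

```python
import xml.etree.ElementTree as ET


def properties_to_dict(root):
    """Collect <property> name/value pairs under root into a dict."""
    configuration = {}
    for entry in root.findall('property'):
        # findtext returns the default (None here) instead of raising
        # when the child element is absent.
        name = entry.findtext('name')
        if name is None:
            continue  # skip properties without a <name>
        configuration[name] = entry.findtext('value')
    return configuration


# Hypothetical sample: the second property deliberately lacks a <value>.
sample = ET.fromstring(
    "<configuration>"
    "<property><name>fs.defaultFS</name><value>hdfs://test:9000</value></property>"
    "<property><name>no.value</name></property>"
    "</configuration>"
)
print(properties_to_dict(sample))
```

As in the original loop, a repeated name overwrites the earlier value, so the last occurrence wins.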
You should then have a dictionary containing every property from the XML configuration. The complete script:
import xml.etree.ElementTree as ET
tree = ET.parse('core-site.xml')
root = tree.getroot()
configuration = dict()
for entry in root.findall('property'):
    name = entry.find('name').text
    value = entry.find('value').text
    configuration[name] = value
print(configuration)
Answer 1 (score: 0)
This question already has an accepted answer, but since I commented on it, I would like to give an example using one of the modules I suggested.
xml = '''<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>file:/home/centos/hadoop_tmp/tmp</value>
</property>
<property>
<name>fs.defaultFS</name>
<value>hdfs://test:9000</value>
</property>
</configuration>'''
import xmltodict
# Load the xml string into a test object
test = xmltodict.parse(xml)
# Instantiate a temporary dictionary where we will store the parsed data
temp_dict = {}
# Time to parse the resulting structure
for name in test:
    # Check that we have the needed 'property' key before doing any processing on the leaf
    if 'property' in test[name].keys():
        # For each property leaf
        for property in test[name]['property']:
            # If the leaf has the stuff you need to save, print it
            if 'name' in property.keys():
                print('Found name', property['name'])
            if 'value' in property.keys():
                print('With value', property['value'])
            # And then save it to the temporary dictionary in the form you need
            # Do note that if you have duplicate "name" strings, only the last "value" will be saved
            temp_dict.update({property['name']: property['value']})
print(temp_dict)
Here is the output:
Found name hadoop.tmp.dir
With value file:/home/centos/hadoop_tmp/tmp
Found name fs.defaultFS
With value hdfs://test:9000
{'hadoop.tmp.dir': 'file:/home/centos/hadoop_tmp/tmp', 'fs.defaultFS': 'hdfs://test:9000'}
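One thing to watch for with xmltodict: when the file contains only a single `<property>`, the parsed value is a dict instead of a list, so iterating over it yields key strings and the `.keys()` calls fail. A minimal sketch of one way around this, assuming the `force_list` argument of `xmltodict.parse` and a hypothetical single-property input:

```python
import xmltodict

# Assumed input: a configuration with a single <property>.
single = '''<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://test:9000</value>
</property>
</configuration>'''

# force_list makes 'property' a list even when only one element is present.
doc = xmltodict.parse(single, force_list=('property',))

config = {}
for prop in doc['configuration']['property']:
    config[prop['name']] = prop['value']

print(config)
```

With `force_list` in place, the same loop handles files with one property or many.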