How can I convert XML data in the following format into a DataFrame?
<start>
<main index = '1', sub = 'english' >
<name value = '1', text = 'hi this is xxx' />
<name value = '2', text = 'isnt this funny' />
</main>
<main index = '2', sub = 'french'>
<name value = '1', text = 'Comment vas-tu' />
<name value = '2', text = 'sil vous plaît résoudre ce'>
</main>
</start>
Expected DataFrame:
mainindex namevalue text
A 1 hi this is xxx
A 2 isnt this funny
B 1 Comment vas-tu
B 2 sil vous plaît résoudre ce
Answer 0 (score: 1)
Another approach:
saveFileName = 'yourOwnFileName.txt'

def main():
    mainindex = None
    with open('yourOwnXml.xml', 'r') as f_read:
        with open(saveFileName, 'w') as f_write:
            for line in f_read:
                # A <main ...> line sets the index shared by the <name> lines that follow it.
                if '<main index' in line:
                    mainindex = line.split('\'')[1]
                # A <name ...> line holds the value and text of one output row.
                if '<name value' in line:
                    name_value = line.split('\'')[1]
                    text = line.split('\'')[3]
                    f_write.write('{mainindex} {namevalue} {text}\n'.format(mainindex=mainindex, namevalue=name_value, text=text))

if __name__ == '__main__':
    main()
The output in yourOwnFileName.txt should be:
1 1 hi this is xxx
1 2 isnt this funny
2 1 Comment vas-tu
2 2 sil vous plaît résoudre ce
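Because the text column itself contains spaces, the space-separated file above is awkward to read back into columns. A minimal sketch of the same line-by-line idea that keeps the rows in memory instead is shown below. The names XML_TEXT, MAIN_RE, NAME_RE and parse_rows are hypothetical and not part of the answer, and the regular expressions assume the exact single-quoted attribute layout of the sample.

```python
import re

# Sample input copied from the question (commas between attributes included).
XML_TEXT = """<start>
<main index = '1', sub = 'english' >
<name value = '1', text = 'hi this is xxx' />
<name value = '2', text = 'isnt this funny' />
</main>
<main index = '2', sub = 'french'>
<name value = '1', text = 'Comment vas-tu' />
<name value = '2', text = 'sil vous plaît résoudre ce'>
</main>
</start>"""

# Assumption: attributes are single-quoted and appear in this order.
MAIN_RE = re.compile(r"<main\s+index\s*=\s*'([^']*)'")
NAME_RE = re.compile(r"<name\s+value\s*=\s*'([^']*)'\s*,?\s*text\s*=\s*'([^']*)'")


def parse_rows(xml_text):
    """Return one dict per <name> line, tagged with the enclosing <main> index."""
    rows = []
    main_index = None
    for line in xml_text.splitlines():
        main_match = MAIN_RE.search(line)
        if main_match:
            main_index = main_match.group(1)
            continue
        name_match = NAME_RE.search(line)
        if name_match:
            rows.append({
                "mainindex": main_index,
                "namevalue": name_match.group(1),
                "text": name_match.group(2),
            })
    return rows


rows = parse_rows(XML_TEXT)
for row in rows:
    print(row["mainindex"], row["namevalue"], row["text"])
```

If pandas is installed, passing the list to pandas.DataFrame(rows) should give a frame with the three columns, with no intermediate text file needed.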
Answer 1 (score: 0)
How about BeautifulSoup?
from bs4 import BeautifulSoup
import pandas as pd

data = """<start>
<main index = '1', sub = 'english' >
<name value = '1', text = 'hi this is xxx' />
<name value = '2', text = 'isnt this funny' />
</main>
<main index = '2', sub = 'french'>
<name value = '1', text = 'Comment vas-tu' />
<name value = '2', text = 'sil vous plaît résoudre ce'>
</main>
</start>"""
# Naming the parser explicitly avoids the "no parser was explicitly specified" warning.
data = BeautifulSoup(data, 'html.parser')
headers = ['mainIndex','nameValue','text']
dataframe = pd.DataFrame(columns=headers)
pos = 0  # row position in the DataFrame
i = 0    # position of the current <main> block, mapped to A, B, ...
for m in data.find_all('main'):
    for name in m.find_all('name'):
        d = []
        d.append(chr(ord('A')+i))
        d.append(name.get('value'))
        d.append(name.get('text'))
        dataframe.loc[pos] = d
        pos+=1
    i+=1
print(dataframe)
mainIndex nameValue text
0 A 1 hi this is xxx
1 A 2 isnt this funny
2 B 1 Comment vas-tu
3 B 2 sil vous plaît résoudre ce
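If the input can be made well-formed first, the standard library parser may be enough. The sketch below assumes a cleaned-up copy of the sample, with the commas between attributes removed and the last name tag self-closed; xml.etree.ElementTree would reject the sample exactly as posted. The names WELL_FORMED_XML and xml_to_rows are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: a corrected version of the sample, not the data as posted.
WELL_FORMED_XML = """<start>
<main index='1' sub='english'>
<name value='1' text='hi this is xxx' />
<name value='2' text='isnt this funny' />
</main>
<main index='2' sub='french'>
<name value='1' text='Comment vas-tu' />
<name value='2' text='sil vous plaît résoudre ce' />
</main>
</start>"""


def xml_to_rows(xml_text):
    """Collect one row per <name>, labelling each <main> block A, B, ... in order."""
    root = ET.fromstring(xml_text)
    rows = []
    for position, main in enumerate(root.findall("main")):
        label = chr(ord("A") + position)
        for name in main.findall("name"):
            rows.append({
                "mainIndex": label,
                "nameValue": name.get("value"),
                "text": name.get("text"),
            })
    return rows


rows = xml_to_rows(WELL_FORMED_XML)
for row in rows:
    print(row["mainIndex"], row["nameValue"], row["text"])
```

Collecting the rows in a list and building the frame once with pandas.DataFrame(rows) also avoids growing the DataFrame one row at a time with .loc.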