I have a set of Python dictionaries produced by a for loop, and I am trying to add them to a pandas DataFrame.
Output of the variable named output:
{'name':'Kevin','age':21}
{'name':'Steve','age':31}
{'name':'Mark','age':11}
I am trying to append each dictionary to a single DataFrame. I tried the following, but it only added the first row.
df = pd.DataFrame(output)
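Can anyone suggest what is going wrong and how to add all of the dictionaries to the DataFrame?
Update: loop statement
The following code helps read the XML and convert it into a DataFrame. I can now iterate over multiple XML files and create a dictionary for each one. I am trying to see how to add each of these dictionaries to a single DataFrame: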
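def f(elem, result):
    # Store this element's text under its tag, then recurse into its children.
    result[elem.tag] = elem.text
    for c in elem:  # iterating an Element yields its children; getchildren() was removed in Python 3.9
        result = f(c, result)
    return result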
result = {}
for file in allFiles:
    tree = ET.parse(file)
    root = tree.getroot()
    result = f(root, result)
    print(result)
Answer 0 (score: 1)
You can append each dictionary to a list and call the DataFrame constructor at the end:
out = []
for file in allFiles:
    tree = ET.parse(file)
    root = tree.getroot()
    # Pass a fresh dict for each file so the rows do not share one object.
    result = f(root, {})
    out.append(result)
df = pd.DataFrame(out)
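One pitfall with this pattern is worth illustrating: if the same dictionary object is reused across iterations, every entry in the list refers to that one object, so all rows end up showing the last values. The sketch below uses made-up records (the `records` list is hypothetical and stands in for the per-file results) to contrast the two cases.

```python
import pandas as pd

# Hypothetical records standing in for the per-file results.
records = [("Kevin", 21), ("Steve", 31), ("Mark", 11)]

# Reusing one dict: every list entry is the same object,
# so each later assignment overwrites what earlier rows show.
shared = {}
rows_shared = []
for name, age in records:
    shared["name"] = name
    shared["age"] = age
    rows_shared.append(shared)

# Creating a fresh dict per iteration: each entry is independent.
rows_fresh = []
for name, age in records:
    rows_fresh.append({"name": name, "age": age})

df_shared = pd.DataFrame(rows_shared)
df_fresh = pd.DataFrame(rows_fresh)
```

Here `df_shared` contains three identical rows, while `df_fresh` contains one row per record.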
Answer 1 (score: 1)
We can add these dicts to a list:
ds = []
for ...:  # your loop
    ds += [d]  # where d is one of the dicts
When we have the list of dicts, we can simply use pd.DataFrame on that list:
ds = [
    {'name': 'Kevin', 'age': 21},
    {'name': 'Steve', 'age': 31},
    {'name': 'Mark', 'age': 11}
]
pd.DataFrame(ds)
Output:
    name  age
0  Kevin   21
1  Steve   31
2   Mark   11
Update: And it's not a problem if different dicts have different keys, e.g.:
ds = [
    {'name': 'Kevin', 'age': 21},
    {'name': 'Steve', 'age': 31, 'location': 'NY'},
    {'name': 'Mark', 'age': 11, 'favorite_food': 'pizza'}
]
pd.DataFrame(ds)
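Output (older pandas versions sort the columns alphabetically, as here; newer versions keep the keys' insertion order):
   age favorite_food location   name
0   21           NaN      NaN  Kevin
1   31           NaN       NY  Steve
2   11         pizza      NaN   Mark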
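To make the missing-key behaviour concrete, here is a small sketch that counts the gaps per column and optionally fills them. The "unknown" placeholder is an arbitrary choice for illustration, not something the original data prescribes.

```python
import pandas as pd

ds = [
    {"name": "Kevin", "age": 21},
    {"name": "Steve", "age": 31, "location": "NY"},
    {"name": "Mark", "age": 11, "favorite_food": "pizza"},
]
df = pd.DataFrame(ds)

# Count how many rows lack a value in each column.
missing = df.isna().sum()

# Optionally replace the gaps, column by column, with a placeholder.
filled = df.fillna({"location": "unknown", "favorite_food": "unknown"})
```

Keys absent from a given dict simply become NaN in that row, so no pre-processing is needed before calling the constructor.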
Update 2: Building on our previous discussion in "Python - Converting xml to csv using Python pandas", we can do:
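import glob
import xml.etree.ElementTree as ET
import pandas as pd

results = []
for file in glob.glob('*.xml'):
    tree = ET.parse(file)
    root = tree.getroot()
    result = f(root, {})  # f is the recursive helper from the question; a fresh dict per file
    result['filename'] = file  # added filename to our results
    results += [result]
pd.DataFrame(results)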
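For anyone who wants to try the whole flow without files on disk, the following is a minimal self-contained sketch. It assumes two tiny inline XML documents (the `docs` mapping and its file names are hypothetical) and parses them with `ET.fromstring` instead of `ET.parse`; the helper `f` mirrors the one from the question.

```python
import xml.etree.ElementTree as ET
import pandas as pd


def f(elem, result):
    # Store this element's text under its tag, then recurse into its children.
    result[elem.tag] = elem.text
    for child in elem:
        f(child, result)
    return result


# Hypothetical XML documents, inlined so the example needs no files.
docs = {
    "a.xml": "<person><name>Kevin</name><age>21</age></person>",
    "b.xml": "<person><name>Steve</name><age>31</age></person>",
}

results = []
for filename, xml_text in docs.items():
    root = ET.fromstring(xml_text)
    result = f(root, {})  # fresh dict per document
    result["filename"] = filename
    results.append(result)

df = pd.DataFrame(results)
```

Note that element text is read as strings, so a column such as `age` holds `'21'` rather than `21`; `pd.to_numeric(df["age"])` is one way to convert it afterwards. The root tag also becomes a column (here `person`), holding the root element's own text.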