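I have a pandas dataframe with multiple header rows. I would like to know how to convert it into a list of nested dictionaries. Each row of the dataframe should become one nested dictionary in the list.
Here is an example:
import pandas as pd

# Create an example multi-header dataframe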
col =['id','x, single room','x, double room','y, single room','y, double room' ]
df = pd.DataFrame([[1,2,3,4,5], [3,4,7,5,3]], columns=col)
a = df.columns.str.split(', ', expand=True).values
# single-level names such as 'id' split into ('id', NaN); turn those into ('', 'id')
df.columns = pd.MultiIndex.from_tuples([('', x[0]) if pd.isnull(x[1]) else x for x in a])
df
Result:
                 x                            y
   id  single room  double room  single room  double room
0   1            2            3            4            5
1   3            4            7            5            3
This is the dataframe I want to convert into a list of nested dictionaries. The desired result is:
[{'id': 1,
'x': {'double room': 3, 'single room': 2},
'y': {'double room': 5, 'single room': 4}},
{'id': 3,
'x': {'double room': 7, 'single room': 4},
'y': {'double room': 3, 'single room': 5}}]
In the code below, I create this list directly.
firstDict = { 'id':1, 'x':{'single room':2, 'double room':3}, 'y':{'single room':4, 'double room':5} }
secondDict = { 'id':3, 'x':{'single room':4, 'double room':7}, 'y':{'single room':5, 'double room':3} }
dictList = []
dictList.append( firstDict )
dictList.append( secondDict )
dictList
[{'id': 1,
'x': {'double room': 3, 'single room': 2},
'y': {'double room': 5, 'single room': 4}},
{'id': 3,
'x': {'double room': 7, 'single room': 4},
'y': {'double room': 3, 'single room': 5}}]
So, in summary, how can I convert the dataframe df into dictList?
Edit:
This is a minimal example; the solution I am looking for should generalize to longer headers.
Answer 0 (score: 6)
I don't think there is a direct way to do this. That said, you can use stack + to_dict followed by some post-processing:
# prepare the DataFrame: use id as the index and move the top header level into the index
df = df.set_index(('', 'id')).stack(level=0)
df.index.names = ['id', None]
# convert to a dict of dicts, keyed by id and then by top-level header
d = {}
for (idi, key), values in df.to_dict('index').items():
    d.setdefault(idi, {}).update({key: values})
# convert d to a list of dicts
result = [{'id': k, **values} for k, values in d.items()]
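To make the post-processing step easier to follow, here is a minimal sketch that runs it on a hand-written dictionary. The `stacked` dictionary is an assumption about what `df.to_dict('index')` returns after the stack step for the example data, and `group_by_id` is a hypothetical helper name, not part of pandas.

```python
# Hypothetical stand-in for df.to_dict('index') after the stack step:
# keys are (id, top-level header) tuples, values are the inner dicts.
stacked = {
    (1, 'x'): {'single room': 2, 'double room': 3},
    (1, 'y'): {'single room': 4, 'double room': 5},
    (3, 'x'): {'single room': 4, 'double room': 7},
    (3, 'y'): {'single room': 5, 'double room': 3},
}


def group_by_id(stacked):
    """Collect the inner dicts under their id, then flatten to a list."""
    grouped = {}
    for (idi, key), values in stacked.items():
        # create the per-id dict on first sight, then attach 'x' or 'y'
        grouped.setdefault(idi, {})[key] = values
    # put the id back into each dictionary as a regular key
    return [{'id': k, **v} for k, v in grouped.items()]


result = group_by_id(stacked)
print(result)
```

Because the grouping is keyed by id, rows that share the same id end up merged into a single dictionary, so this approach fits best when id is unique.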
Answer 1 (score: 1)
I'm not sure how many headers there can be. At the moment there are few enough that it is easy to code by hand, like this:
dct = []
for x in df.values:
    nd = {
        "id": x[0],
        "x": {
            "single room": x[1],
            "double room": x[2]
        },
        "y": {
            "single room": x[3],
            "double room": x[4]
        }
    }
    dct.append(nd)
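Please let me know if there are many headers and the code needs to handle them without typing each one explicitly.
Answer 2 (score: 1)
Something like this?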
import pandas as pd
col =['id','x, single room','x, double room','y, single room','y, double room' ]
df = pd.DataFrame([[1,2,3,4,5], [3,4,7,5,3]], columns=col)
a = df.columns.str.split(', ', expand=True).values
# single-level names such as 'id' split into ('id', NaN); turn those into ('', 'id')
df.columns = pd.MultiIndex.from_tuples([('', x[0]) if pd.isnull(x[1]) else x for x in a])
print(df)
dict_list = []
for index, row in df.iterrows():
    d = {}
    # positional access, because the row is labelled by (level 0, level 1) tuples
    d["id"] = row.iloc[0]
    d["x"] = {'single room': row.iloc[1], 'double room': row.iloc[2]}
    d["y"] = {'single room': row.iloc[3], 'double room': row.iloc[4]}
    dict_list.append(d)
print(dict_list)
Output:
[{'id': 1,
  'x': {'single room': 2, 'double room': 3},
  'y': {'single room': 4, 'double room': 5}
 },
 {'id': 3,
  'x': {'single room': 4, 'double room': 7},
  'y': {'single room': 5, 'double room': 3}
 }
]
Answer 3 (score: 1)
You can use:
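l = []
d = None
for i, row in df.iterrows():
    # Series.iteritems() was removed in pandas 2.0; items() is the replacement
    for (i1, i2), v in row.items():
        if i2 == 'id':
            # 'id' is the first column, so it starts a new dictionary
            d = {i2: v}
            l.append(d)
            continue
        try:
            d[i1][i2] = v
        except KeyError:
            d[i1] = {i2: v}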
Or, if a slight modification of the expected result is acceptable:
from collections import defaultdict
l = []
for i, row in df.iterrows():
    d = defaultdict(dict)
    # Series.iteritems() was removed in pandas 2.0; items() is the replacement
    for (i1, i2), v in row.items():
        if i2 == 'id':
            d[i2][v] = v
        else:
            d[i1][i2] = v
    l.append(dict(d))
Output:
[{'id': {1: 1},
'x': {'single room': 2, 'double room': 3},
'y': {'single room': 4, 'double room': 5}},
{'id': {3: 3},
'x': {'single room': 4, 'double room': 7},
'y': {'single room': 5, 'double room': 3}}]
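Answer 4 (score: 1)
I like the accepted solution, but here are two alternatives of mine that do not use stacking.
This solution is simple and explicit, but it becomes repetitive and error-prone as the number of columns grows: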
lst = [{'id': d[('', 'id')],
        'x': {'single room': d[('x', 'single room')], 'double room': d[('x', 'double room')]},
        'y': {'single room': d[('y', 'single room')], 'double room': d[('y', 'double room')]}}
       for d in df.to_dict('records')]
Let's try to make it more scalable. You can take the nest function from "Arbitrarily nested dictionary from tuples":
def nest(d: dict) -> dict:
    result = {}
    for key, value in d.items():
        target = result
        # walk down (creating levels as needed) along all but the last key part
        for k in key[:-1]:
            target = target.setdefault(k, {})
        target[key[-1]] = value
    return result
But for ('', 'id') we need different behavior:
def nest_m(d: dict) -> dict:
    result = {}
    for key, value in d.items():
        if key == ('', 'id'):
            # keep id at the top level instead of nesting it under ''
            result['id'] = value
        else:
            target = result
            for k in key[:-1]:
                target = target.setdefault(k, {})
            target[key[-1]] = value
    return result
Finally, a one-liner:
lst = [nest_m(d) for d in df.to_dict('records')]
Output:
[{'id': 1,
'x': {'single room': 2, 'double room': 3},
'y': {'single room': 4, 'double room': 5}},
{'id': 3,
'x': {'single room': 4, 'double room': 7},
'y': {'single room': 5, 'double room': 3}}]
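Since the question asks for something that generalizes to longer headers, one possible variation is to drop empty-string levels from every key instead of special-casing ('', 'id'). The sketch below is only an illustration under assumptions: `nest_skip_empty` is a hypothetical name, and the three-level `records` list is made-up sample data standing in for what `df.to_dict('records')` would return for a deeper header.

```python
def nest_skip_empty(record: dict) -> dict:
    """Nest a dict keyed by header tuples, ignoring empty-string levels."""
    result = {}
    for key, value in record.items():
        # drop padding levels such as '' so ('', '', 'id') becomes ['id']
        path = [k for k in key if k != '']
        if not path:
            # a key made only of empty levels has no usable name; skip it
            continue
        target = result
        for k in path[:-1]:
            target = target.setdefault(k, {})
        target[path[-1]] = value
    return result


# Made-up three-level example (not from the question's data)
records = [
    {('', '', 'id'): 1,
     ('x', 'single room', 'price'): 2,
     ('x', 'single room', 'beds'): 1,
     ('x', 'double room', 'price'): 3,
     ('y', 'single room', 'price'): 4},
]

nested = [nest_skip_empty(r) for r in records]
print(nested)
```

With the two-level dataframe from the question, `[nest_skip_empty(d) for d in df.to_dict('records')]` should give the same list as nest_m, as long as the padding level is exactly the empty string.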