I'd like to know whether there is a pandas way (built-in, or just generally better) to unnest a column of records (where each cell is a List[dict]) into a DataFrame.
Sample data:
import pandas as pd
expected = pd.DataFrame({
    'A': [1, 1, 2],
    'asset_id': ["aaa", "AAA", "bbb"],
    'another_prop': [2, 4, 3]  # matches the records in df below
})
df = pd.DataFrame({
    'A': [1, 2],
    'B': [
        [
            {"asset_id": "aaa", "another_prop": 2},
            {"asset_id": "AAA", "another_prop": 4}
        ],
        [
            {"asset_id": "bbb", "another_prop": 3}
        ]
    ]
})
My attempt:
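def unnest_records(df: pd.DataFrame, col: str) -> pd.DataFrame:
    """Unnest a column of records (List[dict]) into a DataFrame."""
    df_unnested = df.explode(col)   # one row per record
    records = df_unnested.pop(col)  # Series of dicts, removed from df_unnested
    # pd.json_normalize replaces the deprecated pd.io.json.json_normalize (pandas >= 1.0);
    # passing a plain list gives a fresh 0..n-1 index that lines up with reset_index.
    return pd.concat(
        [df_unnested.reset_index(drop=True), pd.json_normalize(records.tolist())],
        axis=1,
    )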
Output:
>>> unnest_records(df, "B")
   A asset_id  another_prop
0  1      aaa             2
1  1      AAA             4
2  2      bbb             3
Answer 0 (score: 3)
IIUC, you can use explode, pd.Series, and set_index:
df1 = df.set_index('A')['B'].explode().apply(pd.Series).reset_index()
   A asset_id  another_prop
0  1      aaa             2
1  1      AAA             4
2  2      bbb             3
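Since the question asks for something built in, here is a minimal sketch of another possible route, assuming pandas 1.0 or later (where pd.json_normalize is available at the top level). It converts each row to a dict and lets json_normalize walk the list under "B" while carrying "A" along as metadata. The variable name flat is only illustrative, and the metadata column may come back with object dtype.

```python
import pandas as pd

df = pd.DataFrame({
    "A": [1, 2],
    "B": [
        [
            {"asset_id": "aaa", "another_prop": 2},
            {"asset_id": "AAA", "another_prop": 4},
        ],
        [
            {"asset_id": "bbb", "another_prop": 3},
        ],
    ],
})

# Turn each row into a plain dict, then normalize the list stored under "B",
# repeating the value of "A" for every record found in that row.
flat = pd.json_normalize(df.to_dict("records"), record_path="B", meta=["A"])

# Meta columns are appended after the record fields, so reorder to put "A" first.
flat = flat[["A", "asset_id", "another_prop"]]
print(flat)
```

This avoids building one Series per row, which is what apply(pd.Series) does and which can get slow on larger frames.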
Or, as @anky pointed out:
s = df['B'].explode()
df = df[['A']].join(pd.DataFrame(s.tolist(),index=s.index))
print(df)
   A asset_id  another_prop
0  1      aaa             2
0  1      AAA             4
1  2      bbb             3
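Note that the join keeps the exploded index (0, 0, 1), unlike the first variant. If a fresh 0..n-1 index is wanted, one option is to reset it afterwards. A minimal sketch, where out is just an illustrative name and every list in B is assumed to be non-empty:

```python
import pandas as pd

df = pd.DataFrame({
    "A": [1, 2],
    "B": [
        [
            {"asset_id": "aaa", "another_prop": 2},
            {"asset_id": "AAA", "another_prop": 4},
        ],
        [
            {"asset_id": "bbb", "another_prop": 3},
        ],
    ],
})

# One dict per row; the index still points back at the original row.
s = df["B"].explode()

# Build the record columns on that same index and join them onto "A".
out = df[["A"]].join(pd.DataFrame(s.tolist(), index=s.index))

# Replace the duplicated index (0, 0, 1) with a clean RangeIndex.
out = out.reset_index(drop=True)
print(out)
```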
Answer 1 (score: 1)
You can also do this (note that the result contains only the record fields, not column A):
import itertools as it
pd.DataFrame(it.chain(*df.B))
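To keep column A as well, one possible extension is to repeat each value of A once per record in its list. This is only a sketch, not part of the original answer; records, lengths, and flat are illustrative names.

```python
import itertools as it

import pandas as pd

df = pd.DataFrame({
    "A": [1, 2],
    "B": [
        [
            {"asset_id": "aaa", "another_prop": 2},
            {"asset_id": "AAA", "another_prop": 4},
        ],
        [
            {"asset_id": "bbb", "another_prop": 3},
        ],
    ],
})

# Flatten the lists of dicts into a single list of records.
records = list(it.chain.from_iterable(df["B"]))

# Number of records in each original row, e.g. [2, 1] here.
lengths = df["B"].map(len).to_numpy()

flat = pd.DataFrame(records)

# Repeat each value of "A" once per record and put it in the first column.
flat.insert(0, "A", df["A"].to_numpy().repeat(lengths))
print(flat)
```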