Question

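How do I unpack a nested dictionary whose values include lists into columns? I have tried different combinations to convert this nested dictionary into a pandas DataFrame. Going by what I found on Stack Overflow, I have come close but have not quite solved it.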

Sample data:

test = {
    'abc': {
        'company_id': '123c',
        'names': ['Oscar', 'John Smith', 'Smith, John'],
        'education': ['MS', 'BS']
    },
    'DEF': {
        'company_id': '124b',
        'names': ['Matt B.'],
        'education': ['']
    }
}

Attempts:

1)

pd.DataFrame(list(test.items())) # not working entirely - creates {dictionary in col '1'}

2)

df = pd.concat({
        k: pd.DataFrame.from_dict(v, 'index') for k, v in test.items()
    }, 
    axis=0)

df2 = df.T
df2.reset_index() # creates multiple columns

Desired output:

Answer 1

Update:

With the release of pandas 0.25 and the addition of explode, this is now a lot easier:

frame = pd.DataFrame(test).T
frame = frame.explode('names').set_index(
    ['company_id', 'names'],
    append=True).explode(
    'education').reset_index(
    ['company_id', 'names']
)

Pandas before 0.25:

This is not really lean, but then this is a rather complicated transformation. Inspired by this blog post, I solved it with two separate iterations of turning the list column into a series and then transforming the DataFrame using melt:

import pandas as pd

test = {
    'abc': {
        'company_id': '123c',
        'names': ['Oscar', 'John Smith', 'Smith, John'],
        'education': ['MS', 'BS']
    },
    'DEF': {
        'company_id': '124b',
        'names': ['Matt B.'],
        'education': ['']
    }
}

frame = pd.DataFrame(test).T

names = frame.names.apply(pd.Series)
frame = frame.merge(
    names, left_index=True, right_index=True).drop('names', axis=1)
frame = frame.reset_index().melt(
    id_vars=['index', 'company_id', 'education'],
    value_name='names').drop('variable', axis=1).dropna()

education = frame.education.apply(pd.Series)
frame = frame.merge(
    education, left_index=True, right_index=True).drop('education', axis=1)
frame = frame.melt(
    id_vars=['index', 'company_id', 'names'],
    value_name='education').drop(
    'variable', axis=1).dropna().sort_values(by=['company_id', 'names'])

frame.columns = ['set_name', 'company_id', 'names', 'education']

print(frame)
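
To make the two iterations above easier to follow, here is a minimal sketch of a single pass on a toy frame. The frame `toy` and its columns `key` and `tags` are made up for illustration and are not part of the question's data; the sketch uses `join` where the code above uses an index-based `merge`.

```python
import pandas as pd

# Hypothetical toy data: one ordinary column and one list-valued column.
toy = pd.DataFrame({
    'key': ['a', 'b'],
    'tags': [['x', 'y'], ['z']],
})

# Step 1: turn each list into its own columns (0, 1, ...).
# Shorter lists are padded with NaN.
wide = toy['tags'].apply(pd.Series)

# Step 2: attach the new columns, melt them back into one column,
# then drop the helper 'variable' column and the NaN padding.
tidy = (
    toy.drop('tags', axis=1)
    .join(wide)
    .melt(id_vars=['key'], value_name='tags')
    .drop('variable', axis=1)
    .dropna()
    .sort_values(by=['key', 'tags'])
    .reset_index(drop=True)
)

print(tidy)
```

Each list element ends up on its own row next to its `key`. The pre-0.25 code above applies this same pattern once for names and once for education. On pandas 0.25 or later, `toy.explode('tags')` should give the same three rows in one call.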

Pandas - create a DataFrame from nested key values and nested lists in a dictionary
