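How to parse ... a list of dicts of dicts into a DataFrame?

Time: 2019-04-19 08:24:41

Tags: python-3.x pandas dictionary

I have a list of dictionaries; basically, it is just a large chunk of JSON. Here is what one dictionary in the list looks like: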

{'id': 391257, 'from_id': -1, 'owner_id': -1, 'date': 1554998414, 'marked_as_ads': 0, 'post_type': 'post', 'text': 'Весна — время обновлений. Очищаем балконы от старых лыж и API от устаревших версий: уже скоро запросы к API c версией ниже 5.0 перестанут поддерживаться.\n\nОжидаемая дата изменений: 15 мая 2019 года. \n\nПодробности в Roadmap: https://vk.com/dev/version_update_2.0', 'post_source': {'type': 'vk'}, 'comments': {'count': 91, 'can_post': 1, 'groups_can_post': True}, 'likes': {'count': 182, 'user_likes': 0, 'can_like': 1, 'can_publish': 1}, 'reposts': {'count': 10, 'user_reposted': 0}, 'views': {'count': 63997}, 'is_favorite': False}

I want to dump every dictionary into a DataFrame. If I do

data = pandas.DataFrame(list_of_dicts)

I get a frame with only two columns: the first contains the keys and the other contains the data.

I tried doing it in a loop:

for i in list_of_dicts:
    tmp = pandas.DataFrame().from_dict(i)
    data = pandas.concat([data, tmp])
    print(i)

But I get a ValueError:

Traceback (most recent call last):
  File "/home/keddad/PycharmProjects/vk_group_parse/Data Grabber.py", line 68, in <module>
    main()
  File "/home/keddad/PycharmProjects/vk_group_parse/Data Grabber.py", line 61, in main
    tmp = pandas.DataFrame().from_dict(i)
  File "/home/keddad/anaconda3/envs/vk_group_parse/lib/python3.7/site-packages/pandas/core/frame.py", line 1138, in from_dict
    return cls(data, index=index, columns=columns, dtype=dtype)
  File "/home/keddad/anaconda3/envs/vk_group_parse/lib/python3.7/site-packages/pandas/core/frame.py", line 392, in __init__
    mgr = init_dict(data, index, columns, dtype=dtype)
  File "/home/keddad/anaconda3/envs/vk_group_parse/lib/python3.7/site-packages/pandas/core/internals/construction.py", line 212, in init_dict
    return arrays_to_mgr(arrays, data_names, index, columns, dtype=dtype)
  File "/home/keddad/anaconda3/envs/vk_group_parse/lib/python3.7/site-packages/pandas/core/internals/construction.py", line 51, in arrays_to_mgr
    index = extract_index(arrays)
  File "/home/keddad/anaconda3/envs/vk_group_parse/lib/python3.7/site-packages/pandas/core/internals/construction.py", line 320, in extract_index
    raise ValueError('Mixing dicts with non-Series may lead to '
ValueError: Mixing dicts with non-Series may lead to ambiguous ordering.

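Given this, how can I get a DataFrame of posts (one dictionary in the list is one post) in which all of the data appears as columns?

2 answers:

Answer 0: (score: 1)

I'm not quite sure what you are trying to do, but do you mean something like this?

You can simply print the DataFrame to see the data inside it. Alternatively, you can print each column with the code below.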

data = pandas.DataFrame(list_of_dicts)
print(data)

for i in data.loc[:, data.columns]:
    print(data[i])
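To illustrate what that printout shows, here is a minimal sketch. It assumes list_of_dicts really is a list of post dictionaries shaped like the one in the question; the two shortened records are made up for illustration. pandas.DataFrame creates one column per top-level key and leaves each nested dictionary as a single object in its cell, so a nested column has to be expanded separately:

```python
import pandas as pd

# Hypothetical, shortened records shaped like the post in the question.
list_of_dicts = [
    {"id": 391257, "likes": {"count": 182, "user_likes": 0}, "views": {"count": 63997}},
    {"id": 391258, "likes": {"count": 5, "user_likes": 1}, "views": {"count": 120}},
]

data = pandas_frame = pd.DataFrame(list_of_dicts)
print(data.columns.tolist())  # only the top-level keys become columns

# Expand the nested "likes" dictionaries into their own prefixed columns.
likes = pd.DataFrame(data["likes"].tolist(), index=data.index).add_prefix("likes_")
expanded = data.drop(columns="likes").join(likes)
print(expanded)
```

The same expansion would have to be repeated for every nested key (comments, reposts, views and so on), so this is mainly useful when only one or two nested columns matter.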

Answer 1: (score: 1)

I can't work out exactly what the problem is, but I think you just need to call reset_index on the df, with all the data it currently (apparently) holds:

df.reset_index(inplace=True)

Another thing, if you want to have the keys as columns:

In the for loop:

df = pd.DataFrame.from_dict(i, orient='columns')  # i is the loop variable from the question
# or try orient='index' if you don't get the desired results
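As a possible alternative to building frames inside a loop, pandas can flatten nested dictionaries itself. The sketch below is not taken from either answer. It assumes pandas 1.0 or newer, where json_normalize is available at the top level (older versions expose it as pandas.io.json.json_normalize), and it uses two made-up, shortened records in place of the real list_of_dicts:

```python
import pandas as pd

# Hypothetical, shortened records shaped like the post in the question.
list_of_dicts = [
    {
        "id": 391257,
        "post_source": {"type": "vk"},
        "likes": {"count": 182, "user_likes": 0},
        "views": {"count": 63997},
        "is_favorite": False,
    },
    {
        "id": 391258,
        "post_source": {"type": "vk"},
        "likes": {"count": 5, "user_likes": 1},
        "views": {"count": 120},
        "is_favorite": True,
    },
]

# One row per post; nested keys become dotted column names such as "likes.count".
df = pd.json_normalize(list_of_dicts)
print(df.columns.tolist())
print(df)
```

Each post ends up as a single row, and the nested values (likes, views and so on) each get their own column, which appears to be what the question is after.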