Question

I have a Pandas DataFrame with roughly 500,000 rows in the following format:

**ID  Name  Tags**
4345  Bill  {'circle': 'blue', 'background': 'orange', 'Type': 12}

To make data analysis more straightforward, I would like to convert it to:

**ID   Name  Key         Value** 
4345   Bill  Circle      Blue
4345   Bill  Background  Orange
4345   Bill  Type        12

I found an answer that splits out one key/value pair per row (Python Pandas: How to split a sorted dictionary in a column of a dataframe), but I have not managed to extend it to do what is described above.

I could handle this with some ordinary loops, but I am hoping there is an elegant and efficient Pandas way to do it.

Answer 1

Building on this answer, you can do something like:

>>> df_tags = df.apply(lambda x: pd.Series(x['Tags']),axis=1).stack().reset_index(level=1, drop=False)
>>> df_tags.columns = ['Key', 'Value']
>>> df_tags
          Key   Value
0        Type      12
0  background  orange
0      circle    blue
>>> df.drop('Tags', axis=1).join(df_tags)
     ID  Name         Key   Value
0  4345  Bill        Type      12
0  4345  Bill  background  orange
0  4345  Bill      circle    blue
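
Since the question mentions about 500,000 rows, note that calling pd.Series once per row through apply is likely to be slow at that size. As an alternative, here is a minimal sketch that flattens the dicts with a plain list comprehension and then joins the result back on the original index. It assumes every entry in Tags is a real dict; the sample data, the second row, and the helper column name "row" are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data in the shape described in the question
df = pd.DataFrame({
    "ID": [4345, 4346],
    "Name": ["Bill", "Ann"],
    "Tags": [
        {"circle": "blue", "background": "orange", "Type": 12},
        {"circle": "red"},
    ],
})

# One (original row index, key, value) record per dict entry
records = [
    (idx, key, value)
    for idx, tags in df["Tags"].items()
    for key, value in tags.items()
]

# Index the flattened pairs by the original row index so they can be joined back
df_tags = pd.DataFrame(records, columns=["row", "Key", "Value"]).set_index("row")

# Drop the dict column and attach one output row per key/value pair
result = df.drop(columns="Tags").join(df_tags).reset_index(drop=True)
print(result)
```

With this approach a row whose dict is empty is kept, with missing values in Key and Value, because join performs a left join by default.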
