I have a DataFrame that, after applying some filters, looks like this:
index A .... J
55 7 .... [{'sqlStatement': 'DELETE FROM Z WHERE D=2000', 'number': 200, 'time':3556, 'timestamp': 'Jun 13, 2017 5:41:22 PM' }, {'sqlStatement': 'DELETE FROM U WHERE Z=100', 'number': 450, 'time':8906, 'timestamp': 'Jun 13, 2017 5:49:22 PM'}, {'sqlStatement': 'DELETE FROM U WHERE Z=150', 'number': 270, 'time':9806, 'timestamp': 'Jun 13, 2017 5:58:45 PM'}]
193 7 .... [{'sqlStatement': 'DELETE FROM T WHERE F=98', 'number': 8043, 'time':463465, 'timestamp': 'Jun 13, 2017 6:01:22 PM' }, {'sqlStatement': 'DELETE FROM F WHERE A=98 AND Z=100 ', 'number': 9890, 'time':487569, 'timestamp': 'Jun 13, 2017 6:09:28 PM'}]
I need to split column J out into a new DataFrame. To do that, I use the following code:
for i, (k, v) in enumerate(df['J'].items()):
    df = pd.DataFrame(v)
I get:
index sqlStatement number time timestamp
1 DELETE FROM Z WHERE D=2000 200 3556 Jun 13, 2017 5:41:22 PM
2 DELETE FROM U WHERE Z=100 450 8906 Jun 13, 2017 5:49:22 PM
3 DELETE FROM U WHERE Z=150 270 9806 Jun 13, 2017 5:58:45 PM
4 DELETE FROM T WHERE F=98 8043 463465 Jun 13, 2017 6:01:22 PM
5 DELETE FROM F WHERE A=98 AND Z=100 9890 487569 Jun 13, 2017 6:09:28 PM
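The problem is that I want to add a column containing the index of the original row that each new row came from. What I want to achieve is: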
index sqlStatement number time timestamp old_index
1 DELETE FROM Z WHERE D=2000 200 3556 Jun 13, 2017 5:41:22 PM 55
2 DELETE FROM U WHERE Z=100 450 8906 Jun 13, 2017 5:49:22 PM 55
3 DELETE FROM U WHERE Z=150 270 9806 Jun 13, 2017 5:58:45 PM 55
4 DELETE FROM T WHERE F=98 8043 463465 Jun 13, 2017 6:01:22 PM 193
5 DELETE FROM F WHERE A=98 AND Z=100 9890 487569 Jun 13, 2017 6:09:28 PM 193
Can you help me?
Answer 0 (score: 1)
import pandas as pd

# Sample data: column J holds one list of dicts per row
j = [[{'number': 200,
       'sqlStatement': 'DELETE FROM Z WHERE D=2000',
       'time': 3556,
       'timestamp': 'Jun 13, 2017 5:41:22 PM'},
      {'number': 450,
       'sqlStatement': 'DELETE FROM U WHERE Z=100',
       'time': 8906,
       'timestamp': 'Jun 13, 2017 5:49:22 PM'},
      {'number': 270,
       'sqlStatement': 'DELETE FROM U WHERE Z=150',
       'time': 9806,
       'timestamp': 'Jun 13, 2017 5:58:45 PM'}],
     [{'number': 8043,
       'sqlStatement': 'DELETE FROM T WHERE F=98',
       'time': 463465,
       'timestamp': 'Jun 13, 2017 6:01:22 PM'},
      {'number': 9890,
       'sqlStatement': 'DELETE FROM F WHERE A=98 AND Z=100 ',
       'time': 487569,
       'timestamp': 'Jun 13, 2017 6:09:28 PM'}]]
df = pd.DataFrame({'J': j})
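Use pandas.DataFrame.explode (available from pandas v0.25) to give each list element its own row:
df_explode = df.explode('J')
Use pd.Series to expand the dicts into columns:
df_explode = df_explode.J.apply(pd.Series)
df_explode.reset_index(inplace=True)
# rename returns a new DataFrame, so assign the result back
df_explode = df_explode.rename(columns={'index': 'old_index'})
If you skip reset_index, df_explode.index will still be the original index.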
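For reference, here is a minimal end-to-end sketch using the question's index labels (55 and 193) instead of the default 0 and 1. The helper names expand_with_explode and expand_with_concat are hypothetical, chosen only for this illustration, and the sketch assumes every list in J is non-empty. The second helper is a different technique, pd.concat over one small frame per row, which might be an option on pandas versions older than 0.25 where explode is not available.

```python
import pandas as pd

# Sample data from the question; the index labels 55 and 193 are taken from it.
j = [
    [
        {'sqlStatement': 'DELETE FROM Z WHERE D=2000', 'number': 200,
         'time': 3556, 'timestamp': 'Jun 13, 2017 5:41:22 PM'},
        {'sqlStatement': 'DELETE FROM U WHERE Z=100', 'number': 450,
         'time': 8906, 'timestamp': 'Jun 13, 2017 5:49:22 PM'},
        {'sqlStatement': 'DELETE FROM U WHERE Z=150', 'number': 270,
         'time': 9806, 'timestamp': 'Jun 13, 2017 5:58:45 PM'},
    ],
    [
        {'sqlStatement': 'DELETE FROM T WHERE F=98', 'number': 8043,
         'time': 463465, 'timestamp': 'Jun 13, 2017 6:01:22 PM'},
        {'sqlStatement': 'DELETE FROM F WHERE A=98 AND Z=100 ', 'number': 9890,
         'time': 487569, 'timestamp': 'Jun 13, 2017 6:09:28 PM'},
    ],
]
df = pd.DataFrame({'A': [7, 7], 'J': j}, index=[55, 193])


def expand_with_explode(frame, col='J'):
    """One row per dict in `col`, plus an `old_index` column (pandas >= 0.25)."""
    # explode repeats the original index label once per list element
    exploded = frame[col].explode()
    # Build columns from the dicts; this frame gets a fresh 0..n-1 index
    out = pd.DataFrame(exploded.tolist())
    # Attach the original labels positionally, in the same row order
    out['old_index'] = exploded.index.to_numpy()
    return out


def expand_with_concat(frame, col='J'):
    """Same result without explode: one small frame per row, tagged with its label."""
    parts = [
        pd.DataFrame(records).assign(old_index=label)
        for label, records in frame[col].items()
    ]
    return pd.concat(parts, ignore_index=True)


result = expand_with_explode(df)
print(result['old_index'].tolist())  # [55, 55, 55, 193, 193]
```

Building the columns with pd.DataFrame(list_of_dicts) rather than apply(pd.Series) is a design choice made here for speed on larger frames. Both should give the same columns for data shaped like this.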