I am reading a .parquet file that has a string column like the following:
{"circuitStatus": "CREATED", "startedAt": "2019-02-11T16:07:31.121Z",
"event": "CIRCUIT_CREATION"},
{"circuitStatus": "RUNNING", "startedAt": "2019-02-11T16:07:32.147Z",
"diff": [], "event": "CIRCUIT_UPDATED"}]}
I want to unnest this column, but it fails because the column is a string.
This is the original DataFrame:
In a Jupyter Notebook, I do the following by hand:
df = pd.concat([df.drop(['B'], axis=1), df['B'].apply(pd.Series)], axis=1)
But this only works when the column is not a string:
import pandas as pd

df = pd.DataFrame({'A':'7e1ab727-a9e9-4c00-b6dc-9e65e91b9e4f','B':[{"circuitStatus": "CREATED", "startedAt": "2019-02-11T16:07:31.121Z", "event": "CIRCUIT_CREATION"}, {"circuitStatus": "RUNNING", "startedAt": "2019-02-11T16:07:32.147Z", "diff": [], "event": "CIRCUIT_UPDATED"}]})
df2 = pd.DataFrame({'A':'22222222-a9e9-4c00-b6dc-9e65e91b9e4f','B':[{"circuitStatus": "CREATED", "startedAt": "2019-02-11T16:07:31.121Z", "event": "CIRCUIT_CREATION"}, {"circuitStatus": "RUNNING", "startedAt": "2019-02-11T16:07:32.147Z", "diff": [], "event": "CIRCUIT_UPDATED"}]})
df3 = pd.concat([df, df2])
df3 = pd.concat([df3.drop(['B'], axis=1), df3['B'].apply(pd.Series)], axis=1)
df3
When I run the same code on the data read from the .parquet file, it does not raise an error, but unfortunately it never finishes.
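For reference, here is a minimal sketch of what I think the string case might need. It assumes each cell of B holds the text of a JSON array of objects; if the real string is wrapped in an outer object, the list would have to be pulled out of it first. The `raw` frame is a hypothetical stand-in for the parquet data, not the actual file. The idea is to parse the strings with `json.loads`, put each list element on its own row with `explode`, and expand the dicts with `pd.json_normalize` in one call rather than `apply(pd.Series)`, which builds a Series per row and may be part of why the original never finishes.

```python
import json
import pandas as pd

# Hypothetical stand-in for the parquet data: column B holds JSON text,
# not Python objects (assumption: a JSON array of event objects per cell).
raw = pd.DataFrame({
    "A": ["7e1ab727-a9e9-4c00-b6dc-9e65e91b9e4f"],
    "B": [
        '[{"circuitStatus": "CREATED", "startedAt": "2019-02-11T16:07:31.121Z", '
        '"event": "CIRCUIT_CREATION"}, '
        '{"circuitStatus": "RUNNING", "startedAt": "2019-02-11T16:07:32.147Z", '
        '"diff": [], "event": "CIRCUIT_UPDATED"}]'
    ],
})

# Parse each JSON string into a list of dicts.
parsed = raw.assign(B=raw["B"].map(json.loads))

# One row per list element; ignore_index gives a fresh, unique index.
exploded = parsed.explode("B", ignore_index=True)

# Expand the dicts into columns in a single call.
expanded = pd.json_normalize(exploded["B"].tolist())

# Put the remaining columns and the expanded ones side by side.
result = pd.concat([exploded.drop(columns="B"), expanded], axis=1)
print(result)
```

Keys that are missing from some records (such as "diff" here) end up as NaN in those rows.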