How does pandas.DataFrame.explode work?
The example from the documentation:
https://pandas.pydata.org/pandas-docs/version/0.25/reference/api/pandas.DataFrame.explode.html
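import pandas as pd

# 'A' mixes lists, a string, and an empty list; 'B' is a scalar broadcast to every row
df = pd.DataFrame({'A': [[1, 2, 3], 'foo', [], [3, 4]], 'B': 1})
display(df)
print(df.columns)
print(df.dtypes)
df.explode('A')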
This works fine. But for my data, it fails with a KeyError exception. My data initially looks like this:
with the following types:
print(foo.columns)
print(foo.dtypes)
Index(['model', 'id_min_days_cutoff'], dtype='object')
model object
id_min_days_cutoff int64
dtype: object
where each value in the model column is the coefficient table of a statsmodels regression, obtained with:
model.summary2().tables[1]
When calling:
df.explode('model')
it fails with:
KeyError: 0
Trying to reproduce this:
df_json = df.to_json()
# now load it again for SF purposes
df_json = '{"model":{"0":{"Coef.":{"ALQ_15PLUS_perc":95489.7866599741,"AST_perc":-272.9213162565,"BEV_UNTER15_perc":6781.448845533,"BEV_UEBER65_perc":-46908.2889142205},"Std.Err.":{"ALQ_15PLUS_perc":1399665.9788843254,"AST_perc":1558.1286516172,"BEV_UNTER15_perc":2027111.8764156068,"BEV_UEBER65_perc":1230965.9812726702},"z":{"ALQ_15PLUS_perc":0.0682232676,"AST_perc":-0.1751596802,"BEV_UNTER15_perc":0.0033453747,"BEV_UEBER65_perc":-0.038106893},"P>|z|":{"ALQ_15PLUS_perc":0.9456079052,"AST_perc":0.8609541651,"BEV_UNTER15_perc":0.9973307821,"BEV_UEBER65_perc":0.9696024555},"[0.025":{"ALQ_15PLUS_perc":-2647805.1223393031,"AST_perc":-3326.7973567063,"BEV_UNTER15_perc":-3966284.8215624653,"BEV_UEBER65_perc":-2459557.2784026605},"0.975]":{"ALQ_15PLUS_perc":2838784.6956592514,"AST_perc":2780.9547241933,"BEV_UNTER15_perc":3979847.7192535317,"BEV_UEBER65_perc":2365740.7005742197}},"1":{"Coef.":{"ALQ_15PLUS_perc":-140539.5196612777,"AST_perc":142.579413527,"BEV_UNTER15_perc":-45288.5612893498,"BEV_UEBER65_perc":-152106.9841374909},"Std.Err.":{"ALQ_15PLUS_perc":299852250.9155113101,"AST_perc":24013.7007484301,"BEV_UNTER15_perc":417010365.7919532657,"BEV_UEBER65_perc":171876588.9403209388},"z":{"ALQ_15PLUS_perc":-0.0004686959,"AST_perc":0.0059374194,"BEV_UNTER15_perc":-0.000108603,"BEV_UEBER65_perc":-0.0008849779},"P>|z|":{"ALQ_15PLUS_perc":0.9996260348,"AST_perc":0.9952626525,"BEV_UNTER15_perc":0.9999133474,"BEV_UEBER65_perc":0.9992938899},"[0.025":{"ALQ_15PLUS_perc":-587840151.997330904,"AST_perc":-46923.4091889186,"BEV_UNTER15_perc":-817370586.6933914423,"BEV_UEBER65_perc":-337024031.0927618742},"0.975]":{"ALQ_15PLUS_perc":587559072.9580082893,"AST_perc":47208.5680159725,"BEV_UNTER15_perc":817280009.5708128214,"BEV_UEBER65_perc":336719817.1244869232}}},"id_min_days_cutoff":{"0":2,"1":3}}'
pd.read_json(df_json).explode('model')
This fails as well:
KeyError: 0
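A possible reading of the KeyError: 0 above, offered as an assumption about pandas internals rather than documented behavior: explode appears to treat any non-string iterable cell as list-like and to read its items by position (cell[0], cell[1], ...). A DataFrame cell, or the dict it becomes after the JSON round trip, looks up 0 as a column label or key instead. The sketch below only demonstrates that lookup; the cell values are toy numbers and lookup_zero is a hypothetical helper.

```python
import pandas as pd

# Toy stand-in for one cell of the 'model' column (values are illustrative only).
cell_as_dict = {"Coef.": {"AST_perc": -272.9}, "z": {"AST_perc": -0.18}}
cell_as_frame = pd.DataFrame(cell_as_dict)


def lookup_zero(obj):
    """Return the exception raised by obj[0], or None if the lookup succeeds."""
    try:
        obj[0]
    except KeyError as exc:
        return exc
    return None


for cell in (cell_as_dict, cell_as_frame):
    # Both containers index by key/label, so 0 is treated as a key, not a position.
    print(type(cell).__name__, "->", repr(lookup_zero(cell)))
```

If this reading is right, the column is not "list of scalars" shaped in the way explode expects, which would explain why the documentation example works and this data does not.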
Trying to find an alternative using one of the methods from "How to unnest (explode) a column in a pandas DataFrame?", method 2.1:
pd.DataFrame({'model': np.concatenate(df_json.model.values)},
             index=df_json.index.repeat(df_json.model.str.len()))
But this fails with:
ValueError: zero-dimensional arrays cannot be concatenated
And when applying it to the original df, rather than the one read from JSON:
Exception: Data must be 1-dimensional
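Both errors are consistent with how NumPy sees the cells. The sketch below is a toy reproduction under that assumption: a dict is a single opaque object (a zero-dimensional array), while a DataFrame is two-dimensional, so concatenating such cells never yields the flat 1-D array that a single column needs. The cell values and the try_concatenate helper are illustrative, and the exact wording of the second error may differ between pandas versions.

```python
import numpy as np
import pandas as pd

# Toy stand-in for one cell after the JSON round trip (values are illustrative only).
cell = {"Coef.": {"AST_perc": -272.9}, "z": {"AST_perc": -0.18}}

# NumPy wraps a dict as a 0-d object array.
dict_ndim = np.asarray(cell).ndim
print(dict_ndim)


def try_concatenate(cells):
    """Return the concatenated array, or the ValueError NumPy raises."""
    try:
        return np.concatenate(cells)
    except ValueError as exc:
        return exc


dict_result = try_concatenate([cell, cell])
print(repr(dict_result))

# DataFrame cells are 2-D, so stacking them gives a 2-D array,
# which cannot be used as one DataFrame column.
frame = pd.DataFrame(cell)
frame_result = np.concatenate([frame.to_numpy(), frame.to_numpy()])
print(frame_result.shape)
```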
How can I get the unnest/explode to work properly?
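One possible direction, sketched under the assumption that every cell of model is a small table (a DataFrame, or the nested dict it becomes after the JSON round trip): instead of explode, stack the per-row tables with pd.concat and attach the remaining columns afterwards. The function name unnest_tables, the level names row and term, and the toy data are hypothetical and not part of pandas or of the original data.

```python
import pandas as pd


def unnest_tables(df, column):
    """Stack the per-row tables stored in `column` and repeat the other columns."""
    # pd.DataFrame(...) accepts both DataFrame cells and nested-dict cells.
    tables = [pd.DataFrame(cell) for cell in df[column]]
    # keys=df.index records which original row each stacked line came from.
    stacked = pd.concat(tables, keys=df.index, names=["row", "term"]).reset_index()
    others = df.drop(columns=column)
    # Bring the remaining columns back by matching 'row' to the original index.
    return stacked.merge(others, left_on="row", right_index=True, how="left")


# Toy data shaped like the JSON-loaded frame (values are illustrative only).
toy = pd.DataFrame({
    "model": [
        {"Coef.": {"a": 1.0, "b": 2.0}, "z": {"a": 0.1, "b": 0.2}},
        {"Coef.": {"a": 3.0, "b": 4.0}, "z": {"a": 0.3, "b": 0.4}},
    ],
    "id_min_days_cutoff": [2, 3],
})
flat = unnest_tables(toy, "model")
print(flat)
```

This yields one line per regression term per original row, with id_min_days_cutoff repeated alongside, which is roughly what an explode over tables would be expected to produce.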