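I want to split the values of the dictionaries stored in one column into separate columns. Assuming I have millions of rows, how can I do this without using a for loop?
Currently, this is what I am doing: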
import pandas as pd

s = {"alpha": ['apple', 'ball'] * 300,
     "data": [{"source": 'CNN', 'time': 'two'}, {"license": 'CNN', 'time': 'two'}] * 300}
pp = pd.DataFrame(s)
start = 0
st = pd.DataFrame()
intermediate = 100
while start < len(pp):
    # .loc slices are label-inclusive, so each chunk covers start..intermediate
    few = pp.loc[start:intermediate, :]
    # print(few)
    # expand the dicts of this chunk into columns and glue them to the rest
    few_edges1 = pd.concat([few.drop(['data'], axis=1), few['data'].apply(pd.Series)], axis=1)
    st = pd.concat([st, few_edges1])
    start = intermediate + 1
    intermediate = intermediate + 100
    # if start % 500000==0:
    print(st.shape)
st.head()
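Note that the dictionaries may not share the same keys. In this example there are only 3 distinct keys, but the real data may have dozens.
Thanks,
Sam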
Answer 0 (score: 3)
IIUC, just:
st = (pp.drop('data', axis=1)
        .join(pd.DataFrame.from_records(pp['data'].values))
     )
Output (st.head()):
   alpha source time license
0  apple    CNN  two     NaN
1   ball    NaN  two     CNN
2  apple    CNN  two     NaN
3   ball    NaN  two     CNN
4  apple    CNN  two     NaN
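
One caveat worth illustrating: join aligns on the index, and a frame built from the raw values gets a fresh 0..n-1 index. If pp has been filtered or re-indexed, the rows may not line up. The sketch below is one way to guard against that, assuming every entry in the column is a flat dict. The helper name expand_dict_column is hypothetical and not part of pandas.

```python
import pandas as pd


def expand_dict_column(df, column="data"):
    """Hypothetical helper: spread a column of flat dicts into separate columns.

    Assumes every value in `column` is a dict. Keys missing from a given
    row's dict become NaN in that row.
    """
    # Build the expanded frame in one call and reuse the original index,
    # so the join below lines rows up even if the index is not 0..n-1.
    expanded = pd.DataFrame(df[column].tolist(), index=df.index)
    return df.drop(columns=[column]).join(expanded)


# Same sample data as in the question.
pp = pd.DataFrame({
    "alpha": ["apple", "ball"] * 300,
    "data": [{"source": "CNN", "time": "two"},
             {"license": "CNN", "time": "two"}] * 300,
})

st = expand_dict_column(pp)
print(st.shape)
print(st.head())
```

On the sample data this should give the same four columns as the output above. Passing index=df.index is the only substantive difference from the one-liner, and it only matters when the index is not the default range.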