I have a pandas DataFrame of the following form:
col1 col2
1 a {hu, fdf, ko, dss}
2 b {sdsjdn, lk}
3 c {sds, aldj, dhva}
Now I want to split the set values into multiple rows, so that it looks like this:
col1 col2
1 a hu
2 a fdf
3 a ko
4 a dss
5 b sdsjdn
6 b lk
7 c sds
8 c aldj
9 c dhva
Does anyone have any insight into how to do this?
Answer 0 (score: 3)
You need numpy.repeat to build the new column of repeated values, and chain.from_iterable to flatten the column of sets:
import pandas as pd

df = pd.DataFrame({'col1': ['a', 'b', 'c'],
                   'col2': [{'hu', 'fdf', 'ko', 'dss'},
                            {'sdsjdn', 'lk'},
                            {'sds', 'aldj', 'dhva'}]})
print(df)
col1 col2
0 a {hu, dss, ko, fdf}
1 b {lk, sdsjdn}
2 c {dhva, aldj, sds}
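from itertools import chain
import numpy as np

# Repeat each col1 value once per element of its set,
# and flatten all the sets into one list in the same row order.
df1 = pd.DataFrame({
    "col1": np.repeat(df.col1.values, df.col2.str.len()),
    "col2": list(chain.from_iterable(df.col2))})
print(df1)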
col1 col2
0 a hu
1 a dss
2 a ko
3 a fdf
4 b lk
5 b sdsjdn
6 c dhva
7 c aldj
8 c sds
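
Because sets are unordered, the order of the values within each group above can differ from run to run. As an alternative sketch (not part of the original answer), newer pandas versions offer DataFrame.explode, which may be simpler if it is available in your installation. The names as_lists and df2 below are only illustrative, and sorting each set is an assumption made here to get a reproducible row order:

```python
import pandas as pd

df = pd.DataFrame({
    "col1": ["a", "b", "c"],
    "col2": [{"hu", "fdf", "ko", "dss"},
             {"sdsjdn", "lk"},
             {"sds", "aldj", "dhva"}],
})

# Sets have no defined order, so turn each one into a sorted list first
# (an assumption made here so the result is deterministic).
as_lists = df.assign(col2=df["col2"].map(sorted))

# explode() emits one row per list element and repeats col1 alongside it.
df2 = as_lists.explode("col2").reset_index(drop=True)

# Optional: start the index at 1, as in the question's desired output.
df2.index = df2.index + 1
print(df2)
```

With the sorting step, the values in each group come out alphabetically (dss, fdf, hu, ko for a, and so on). If the order does not matter, the sorting step can probably be dropped, since explode is documented to accept sets as well.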