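I have a dataframe in which one column holds ragged lists of tuples. Every tuple has the same length; only the lists differ in length. I would like to melt this column within the frame, so that the new columns are appended to the existing ones and the rows are duplicated accordingly. Like this: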
df
name id list_of_tuples
0 john doe abc-123 [('cat',100,'xyz-123'),('cat',96,'uvw-456')]
1 bob smith def-456 [('dog',98,'rst-789'),('dog',97,'opq-123'),('dog',95,'lmn-123')]
2 bob parr ghi-789 [('tree',100,'ijk-123')]
df_new
name id val_1 val_2 val_3
0 john doe abc-123 cat 100 xyz-123
1 john doe abc-123 cat 96 uvw-456
2 bob smith def-456 dog 98 rst-789
3 bob smith def-456 dog 97 opq-123
4 bob smith def-456 dog 95 lmn-123
5 bob parr ghi-789 tree 100 ijk-123
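With my current approach I build a new dataframe using the chain function from itertools, but I would like to avoid the hassle of creating a second dataframe and merging it back on the 'id' column.
Here is my current code:
df_new = pd.DataFrame(list(chain.from_iterable(df.list_of_tuples)),columns=['val_1','val_2','val_3']).reset_index(drop=True)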
df_new['id'] = np.repeat(df.id.values, df['list_of_tuples'].str.len())
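For anyone who wants to reproduce this, here is a minimal self-contained sketch of the approach described above. The sample frame is rebuilt from the table in the question, and the imports are assumptions about what the original session had loaded.

```python
from itertools import chain

import numpy as np
import pandas as pd

# Sample data rebuilt from the table in the question.
df = pd.DataFrame({
    'name': ['john doe', 'bob smith', 'bob parr'],
    'id': ['abc-123', 'def-456', 'ghi-789'],
    'list_of_tuples': [
        [('cat', 100, 'xyz-123'), ('cat', 96, 'uvw-456')],
        [('dog', 98, 'rst-789'), ('dog', 97, 'opq-123'), ('dog', 95, 'lmn-123')],
        [('tree', 100, 'ijk-123')],
    ],
})

# Flatten every list into one row per tuple.
df_new = pd.DataFrame(list(chain.from_iterable(df['list_of_tuples'])),
                      columns=['val_1', 'val_2', 'val_3'])

# Repeat each id once per tuple in its row so it lines up with the flat rows.
df_new['id'] = np.repeat(df['id'].values, df['list_of_tuples'].str.len())
```

At this point df_new has the three value columns plus 'id', but not 'name', which is why a merge back onto df would still be needed.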
Answer 0 (score: 2)
Unnest your lists, then concat:
s=df.list_of_tuples
pd.concat([pd.DataFrame({'id':df.id.repeat(s.str.len())}).reset_index(drop=True),pd.DataFrame(np.concatenate(s.values))],axis=1)
Out[118]:
id 0 1 2
0 abc-123 cat 100 xyz-123
1 abc-123 cat 96 uvw-456
2 def-456 dog 98 rst-789
3 def-456 dog 97 opq-123
4 def-456 dog 95 lmn-123
5 ghi-789 tree 100 ijk-123
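A variation on the same idea, as a sketch rather than a drop-in replacement: it carries 'name' along with 'id' and names the value columns. As far as I can tell, np.concatenate turns tuples that mix strings and integers into an array of strings, so this version flattens with itertools.chain instead to keep the integers as integers. The sample frame is rebuilt from the question.

```python
from itertools import chain

import pandas as pd

# Sample data rebuilt from the table in the question.
df = pd.DataFrame({
    'name': ['john doe', 'bob smith', 'bob parr'],
    'id': ['abc-123', 'def-456', 'ghi-789'],
    'list_of_tuples': [
        [('cat', 100, 'xyz-123'), ('cat', 96, 'uvw-456')],
        [('dog', 98, 'rst-789'), ('dog', 97, 'opq-123'), ('dog', 95, 'lmn-123')],
        [('tree', 100, 'ijk-123')],
    ],
})

# Number of tuples in each row's list.
lengths = df['list_of_tuples'].str.len().to_numpy()

# Repeat the key columns once per tuple, then renumber the rows.
left = df.loc[df.index.repeat(lengths), ['name', 'id']].reset_index(drop=True)

# Flatten the tuples; building from Python tuples lets pandas infer a dtype per column.
right = pd.DataFrame(list(chain.from_iterable(df['list_of_tuples'])),
                     columns=['val_1', 'val_2', 'val_3'])

df_new = pd.concat([left, right], axis=1)
```

Both halves have a fresh 0..n-1 index, so the concat along axis=1 aligns them row by row without a merge.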
Answer 1 (score: 1)
Let's use apply with pd.Series:
(df.set_index('id').list_of_tuples  # set id as the index and select the list_of_tuples column
   .apply(pd.Series)                # apply pd.Series to split each list into one column per element
   .stack()                         # stack the elements vertically
   .apply(pd.Series)                # apply pd.Series again to split each tuple into columns
   .add_prefix('val_')              # prefix every column name with val_
   .reset_index()                   # move id back into the frame as a column
   .drop('level_1', axis=1)         # drop the level_1 column left over from stack
)
Output:
id val_0 val_1 val_2
0 abc-123 cat 100 xyz-123
1 abc-123 cat 96 uvw-456
2 def-456 dog 98 rst-789
3 def-456 dog 97 opq-123
4 def-456 dog 95 lmn-123
5 ghi-789 tree 100 ijk-123
Edited to handle the question's edit, which adds 'name' to the dataframe:
(df.set_index(['name','id']).list_of_tuples
   .apply(pd.Series)
   .stack()
   .apply(pd.Series)
   .add_prefix('val_')
   .reset_index(level=-1,drop=True)
   .reset_index()
)
Output:
name id val_0 val_1 val_2
0 John Doe abc-123 cat 100 xyz-123
1 John Doe abc-123 cat 96 uvw-456
2 Bob Smith def-456 dog 98 rst-789
3 Bob Smith def-456 dog 97 opq-123
4 Bob Smith def-456 dog 95 lmn-123
5 Bob Parr ghi-789 tree 100 ijk-123
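On pandas 0.25 or later, DataFrame.explode may be a simpler route to the same result. This is only a sketch: it is not part of the original answer, the sample frame is rebuilt from the question, and it assumes every list is non-empty (explode turns an empty list into NaN, which the tuple-splitting step would not handle).

```python
import pandas as pd

# Sample data rebuilt from the table in the question.
df = pd.DataFrame({
    'name': ['john doe', 'bob smith', 'bob parr'],
    'id': ['abc-123', 'def-456', 'ghi-789'],
    'list_of_tuples': [
        [('cat', 100, 'xyz-123'), ('cat', 96, 'uvw-456')],
        [('dog', 98, 'rst-789'), ('dog', 97, 'opq-123'), ('dog', 95, 'lmn-123')],
        [('tree', 100, 'ijk-123')],
    ],
})

# One row per tuple; name and id are duplicated automatically.
exploded = df.explode('list_of_tuples').reset_index(drop=True)

# Split each tuple into its own columns.
vals = pd.DataFrame(exploded['list_of_tuples'].tolist(),
                    columns=['val_1', 'val_2', 'val_3'])

df_new = pd.concat([exploded[['name', 'id']], vals], axis=1)
```

This avoids the two apply(pd.Series) passes, which tend to be slow on larger frames.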