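I have the following:
import numpy as np
import pandas as pd
pa = pd.DataFrame({'a': np.array([[1., 4.], [2., 3.3], [3., 4., 5.]], dtype=object),  # dtype=object: rows are ragged
                   'b': np.array([[2., 5.], [3., 6.], [4., 5., 6.]], dtype=object)})
This produces: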
                 a                b
0       [1.0, 4.0]       [2.0, 5.0]
1       [2.0, 3.3]       [3.0, 6.0]
2  [3.0, 4.0, 5.0]  [4.0, 5.0, 6.0]
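I have tried various techniques to combine the corresponding items of each array into new arrays, one pair per row.
Something like this: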
                 a                b           c
0       [1.0, 4.0]       [2.0, 5.0]  [1.0, 2.0]
1       [1.0, 4.0]       [2.0, 5.0]  [4.0, 5.0]
2       [2.0, 3.3]       [3.0, 6.0]  [2.0, 3.0]
3       [2.0, 3.3]       [3.0, 6.0]  [3.3, 6.0]
4  [3.0, 4.0, 5.0]  [4.0, 5.0, 6.0]  [3.0, 4.0]
5  [3.0, 4.0, 5.0]  [4.0, 5.0, 6.0]  [4.0, 5.0]
6  [3.0, 4.0, 5.0]  [4.0, 5.0, 6.0]  [5.0, 6.0]
If there are other columns, I can update them for the newly created column myself. But I am stuck getting to this point.
Can anyone help?
Answer 0 (score: 2)
Use zip to pair the items, then unnest the result:
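pa['New'] = [list(zip(x, y)) for x, y in zip(pa.a, pa.b)]  # pair items row by row
s = pa.New.str.len()  # number of pairs in each row
# repeat each source row once per pair, and flatten the pairs into one list
df = pd.DataFrame({'New': list(map(list, pa.New.sum())), 'a': pa['a'].repeat(s), 'b': pa['b'].repeat(s)})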
df
          New                a                b
0  [1.0, 2.0]       [1.0, 4.0]       [2.0, 5.0]
0  [4.0, 5.0]       [1.0, 4.0]       [2.0, 5.0]
1  [2.0, 3.0]       [2.0, 3.3]       [3.0, 6.0]
1  [3.3, 6.0]       [2.0, 3.3]       [3.0, 6.0]
2  [3.0, 4.0]  [3.0, 4.0, 5.0]  [4.0, 5.0, 6.0]
2  [4.0, 5.0]  [3.0, 4.0, 5.0]  [4.0, 5.0, 6.0]
2  [5.0, 6.0]  [3.0, 4.0, 5.0]  [4.0, 5.0, 6.0]
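For comparison, the same zip-then-unnest idea can probably be written more compactly with `DataFrame.explode`. This is only a sketch, assuming pandas 0.25 or later (the release that added `explode`). The column name `c` is borrowed from the question's desired output, and `pairs` and `out` are illustrative names that do not appear in the answer above.

```python
import pandas as pd

# Same data as in the question, written as plain Python lists
pa = pd.DataFrame({
    "a": [[1.0, 4.0], [2.0, 3.3], [3.0, 4.0, 5.0]],
    "b": [[2.0, 5.0], [3.0, 6.0], [4.0, 5.0, 6.0]],
})

# Pair items position by position: one list of [a_i, b_i] pairs per row
pairs = [[list(p) for p in zip(x, y)] for x, y in zip(pa["a"], pa["b"])]

# explode() emits one row per pair and repeats the other columns alongside it
out = pa.assign(c=pairs).explode("c")
print(out)
```

As with the answer above, the result keeps the repeated row labels 0, 0, 1, 1, 2, 2, 2.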
Answer 1 (score: 0)
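def f(row):
    # Pair up the items of 'a' and 'b' within one row
    return pd.Series(list(zip(row["a"], row["b"])))

# pa is the DataFrame from the question (columns 'a' and 'b' only)
mod = pa.apply(f, axis=1).stack().dropna()  # drop NaN padding from shorter rows
mod.index = mod.index.get_level_values(0)  # keep only the original row label
pa.merge(mod.to_frame("c"), left_index=True, right_index=True)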
                 a                b           c
0       [1.0, 4.0]       [2.0, 5.0]  (1.0, 2.0)
0       [1.0, 4.0]       [2.0, 5.0]  (4.0, 5.0)
1       [2.0, 3.3]       [3.0, 6.0]  (2.0, 3.0)
1       [2.0, 3.3]       [3.0, 6.0]  (3.3, 6.0)
2  [3.0, 4.0, 5.0]  [4.0, 5.0, 6.0]  (3.0, 4.0)
2  [3.0, 4.0, 5.0]  [4.0, 5.0, 6.0]  (4.0, 5.0)
2  [3.0, 4.0, 5.0]  [4.0, 5.0, 6.0]  (5.0, 6.0)
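Note that the question's desired output is numbered 0 to 6, while the result here keeps the repeated labels of the source rows. If a fresh index is wanted, `reset_index(drop=True)` should be enough. A minimal sketch, where `result` is a hypothetical stand-in for the merged frame and `renumbered` is an illustrative name:

```python
import pandas as pd

# Hypothetical stand-in for the merged result: repeated row labels 0, 0, 1
result = pd.DataFrame(
    {"c": [(1.0, 2.0), (4.0, 5.0), (2.0, 3.0)]},
    index=[0, 0, 1],
)

# drop=True discards the old labels instead of keeping them as a new column
renumbered = result.reset_index(drop=True)
print(renumbered.index.tolist())
```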