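I have two pivot tables with the same rows and columns, and I need to create a single table in which each cell holds the values from the matching row and column of both tables, separated by commas.
For example:
Table 1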
1 2 3 4
a t1a1 t1a2 t1a3 t1a4
b t1b1 t1b2 t1b3 t1b4
Table 2
1 2 3 4
a t2a1 t2a2 t2a3 t2a4
b t2b1 t2b2 t2b3 t2b4
I want:
Result table
1 2 3 4
a (t1a1,t2a1) (t1a2,t2a2) (t1a3,t2a3) (t1a4,t2a4)
b (t1b1,t2b1) (t1b2,t2b2) (t1b3,t2b3) (t1b4,t2b4)
The concat function places the two tables side by side instead, returning:
1 2 3 4 1 2 3 4
a t1a1 t1a2 t1a3 t1a4 t2a1 t2a2 t2a3 t2a4
b t1b1 t1b2 t1b3 t1b4 t2b1 t2b2 t2b3 t2b4
I am using the pandas library in Python.
Thanks
Answer 0 (score: 3)
If you need string output, you can concatenate the DataFrames directly with string literals:
df = '(' + df1 + ' , ' + df2 + ')'
#if numeric columns first cast to str
#df = '(' + df1.astype(str) + ' , ' + df2.astype(str) + ')'
print (df)
1 2 3 4
a (t1a1 , t2a1) (t1a2 , t2a2) (t1a3 , t2a3) (t1a4 , t2a4)
b (t1b1 , t2b1) (t1b2 , t2b2) (t1b3 , t2b3) (t1b4 , t2b4)
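For anyone who wants to run this end to end, here is a minimal, self-contained sketch of the string approach. The sample frames are an assumption: they are rebuilt from the tables in the question, with the integer column labels 1 to 4 and the index labels a and b.

```python
import pandas as pd

# Assumed sample data, reconstructed from the tables in the question.
cols = [1, 2, 3, 4]
rows = ["a", "b"]
df1 = pd.DataFrame([[f"t1{r}{c}" for c in cols] for r in rows],
                   index=rows, columns=cols)
df2 = pd.DataFrame([[f"t2{r}{c}" for c in cols] for r in rows],
                   index=rows, columns=cols)

# Element-wise string concatenation; pandas aligns both frames on
# index and column labels before combining the cells.
result = "(" + df1 + "," + df2 + ")"
print(result)
```

Because the addition aligns on labels, a row or column that exists in only one frame should come out as NaN rather than raising an error.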
If you need tuples:
df = pd.concat([df1, df2], keys=['a','b']) \
.groupby(level=1).agg(lambda x: tuple(x))
print (df)
1 2 3 4
a (t1a1, t2a1) (t1a2, t2a2) (t1a3, t2a3) (t1a4, t2a4)
b (t1b1, t2b1) (t1b2, t2b2) (t1b3, t2b3) (t1b4, t2b4)
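As an alternative to the concat/groupby route, the tuples can probably also be built column by column with zip. This is only a sketch, not the method from the answer, and it assumes both frames have identical index and column labels in the same order (the sample frames are again reconstructed from the question).

```python
import pandas as pd

# Assumed sample data, reconstructed from the tables in the question.
cols = [1, 2, 3, 4]
rows = ["a", "b"]
df1 = pd.DataFrame([[f"t1{r}{c}" for c in cols] for r in rows],
                   index=rows, columns=cols)
df2 = pd.DataFrame([[f"t2{r}{c}" for c in cols] for r in rows],
                   index=rows, columns=cols)

# Pair the values of each column positionally; this relies on df1 and
# df2 sharing the same row order and the same column labels.
result = pd.DataFrame(
    {c: list(zip(df1[c], df2[c])) for c in df1.columns},
    index=df1.index,
)
print(result)
```

Unlike the string version, each cell here is a real Python tuple, so the two values can be unpacked later without parsing.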
Answer 1 (score: 1)
Here is a simple approach:
import numpy as np
import pandas as pd

# Both frames share a 'name' key column; the attribute columns are named differently
df1 = pd.DataFrame(np.array([
['a','t1a1','t1a2','t1a3','t1a4'],
['b','t1b1','t1b2','t1b3','t1b4'],
['c','t1c1','t1c2','t1c3','t1c4']]),
columns=['name', 'attr11', 'attr12', 'attr13', 'attr14'])
df2 = pd.DataFrame(np.array([
['a','t2a1','t2a2','t2a3','t2a4'],
['b','t2b1','t2b2','t2b3','t2b4'],
['c','t2c1','t2c2','t2c3','t2c4']]),
columns=['name', 'attr21', 'attr22', 'attr23', 'attr24'])
# Join on 'name' so that matching rows line up
df3 = pd.merge(df1, df2, on='name')
# Build each combined column as a "(left,right)" string
df3["attr1"] = '(' + df3['attr11'] + ',' + df3['attr21'] + ')'
df3["attr2"] = '(' + df3['attr12'] + ',' + df3['attr22'] + ')'
df3["attr3"] = '(' + df3['attr13'] + ',' + df3['attr23'] + ')'
df3["attr4"] = '(' + df3['attr14'] + ',' + df3['attr24'] + ')'
print(df3[['name', 'attr1', 'attr2', 'attr3', 'attr4']])
Output:
name attr1 attr2 attr3 attr4
0 a (t1a1,t2a1) (t1a2,t2a2) (t1a3,t2a3) (t1a4,t2a4)
1 b (t1b1,t2b1) (t1b2,t2b2) (t1b3,t2b3) (t1b4,t2b4)
2 c (t1c1,t2c1) (t1c2,t2c2) (t1c3,t2c3) (t1c4,t2c4)
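If there are many attribute columns, writing one assignment per column gets tedious. The sketch below is one possible generalization, not part of the original answer: it assumes the attribute columns of both frames appear in the same positional order, renames them to a shared set of names (the attrN names are chosen here for illustration), and then combines everything in a single expression.

```python
import pandas as pd

# Assumed sample data, a shortened version of the frames above.
df1 = pd.DataFrame({"name": ["a", "b"],
                    "attr11": ["t1a1", "t1b1"],
                    "attr12": ["t1a2", "t1b2"]})
df2 = pd.DataFrame({"name": ["a", "b"],
                    "attr21": ["t2a1", "t2b1"],
                    "attr22": ["t2a2", "t2b2"]})

# Use 'name' as the index so rows are matched by key, not by position.
left = df1.set_index("name")
right = df2.set_index("name")

# Give both frames the same column labels, pairing columns by position.
out_cols = [f"attr{i}" for i in range(1, left.shape[1] + 1)]
left.columns = out_cols
right.columns = out_cols

# One element-wise concatenation covers every column at once.
result = ("(" + left + "," + right + ")").reset_index()
print(result)
```

This trades the explicit merge for label alignment on the name index, so it only makes sense when each name occurs once per frame.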