I have this DataFrame:
import pandas as pd
df = pd.DataFrame({'alpha': ['ab', 'ab', 'ab', 'cd', 'cd', 'cd'],
                   'beta': ['12', '34', '56', '78', '90', '22']})
df
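For each group in the column named 'alpha', I want to generate a new column named 'gamma'. Together, the columns 'beta' and 'gamma' should contain every permutation (ordered pair) of that group's 'beta' values, like this: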
df1 = pd.DataFrame({'alpha': ['ab', 'ab', 'ab', 'ab', 'ab', 'ab', 'cd', 'cd', 'cd', 'cd', 'cd', 'cd'],
                    'beta': ['12', '34', '56', '12', '56', '34', '78', '90', '22', '22', '78', '90'],
                    'gamma': ['34', '12', '12', '56', '34', '56', '90', '78', '78', '90', '22', '22']})
df1
I tried the following:
from itertools import permutations, product
df['gamma']= df['beta']
dfg = df.groupby('alpha')
perms = {}
for a, v in dfg:
    perms[a] = list(permutations(v.values))
print(perms)
pd.DataFrame(perms)
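Answer 0 (score: 1)
Your code uses permutations incorrectly for what you want. You need to permute only the values of the beta column, and have itertools.permutations take 2 elements at a time. Example: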
from itertools import permutations
grouped = df.groupby('alpha')
resultlist = []
for key, group in grouped:
    # every ordered pair of distinct 'beta' entries within this group
    for b, g in permutations(group['beta'].tolist(), 2):
        resultlist.append([key, b, g])
# one row per (alpha, beta, gamma) pair
result = pd.DataFrame(resultlist, columns=['alpha', 'beta', 'gamma'])
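To make the "2 elements at a time" part concrete, here is a minimal sketch of what itertools.permutations yields for a single group. The list `values` is just the 'beta' values of group 'ab' copied from the question; the variable names are only for illustration.

```python
from itertools import permutations

# 'beta' values of group 'ab', taken from the question's DataFrame
values = ['12', '34', '56']

# r=2 asks for ordered pairs of distinct positions, so 3 * 2 = 6 tuples
pairs = list(permutations(values, 2))
print(pairs)
# [('12', '34'), ('12', '56'), ('34', '12'), ('34', '56'), ('56', '12'), ('56', '34')]
```

Each tuple becomes one (beta, gamma) row for that group, which is why three input rows per group turn into six output rows.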
Demo:
In [29]: df
Out[29]:
  alpha beta
0    ab   12
1    ab   34
2    ab   56
3    cd   78
4    cd   90
5    cd   22
In [30]: grouped = df.groupby('alpha')
In [31]: resultlist = []
In [32]: for key,group in grouped:
   ....:     for b,g in itertools.permutations(group['beta'].tolist(),2):
   ....:         resultlist.append([key,b,g])
   ....:
In [33]: result = pd.DataFrame(resultlist,columns=['alpha','beta','gamma'])
In [34]: result
Out[34]:
   alpha beta gamma
0     ab   12    34
1     ab   12    56
2     ab   34    12
3     ab   34    56
4     ab   56    12
5     ab   56    34
6     cd   78    90
7     cd   78    22
8     cd   90    78
9     cd   90    22
10    cd   22    78
11    cd   22    90
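If you would rather avoid the explicit Python loop, one possible alternative (not part of the answer above) is to merge the frame with itself on 'alpha' and drop the rows where a value is paired with itself. This is only a sketch and it assumes the 'beta' values are unique within each group; with duplicates, the filter would also discard pairs that permutations would keep. The suffix '_other' is an arbitrary name chosen here.

```python
import pandas as pd

df = pd.DataFrame({'alpha': ['ab', 'ab', 'ab', 'cd', 'cd', 'cd'],
                   'beta': ['12', '34', '56', '78', '90', '22']})

# Self-merge on 'alpha': every row is paired with every row of the same group
pairs = df.merge(df, on='alpha', suffixes=('', '_other'))
pairs = pairs.rename(columns={'beta_other': 'gamma'})

# Drop self-pairs (assumes 'beta' is unique within each group)
result = pairs[pairs['beta'] != pairs['gamma']].reset_index(drop=True)
print(result)
```

For the sample data this should give the same twelve (alpha, beta, gamma) rows as the loop.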
Answer 1 (score: 1)
You can use apply:
In [192]: (df.groupby('alpha')
   .....:    .apply(lambda x: pd.DataFrame(list(permutations(x['beta'], 2))))
   .....:    .reset_index())
Out[192]:
   alpha  level_1   0   1
0     ab        0  12  34
1     ab        1  12  56
2     ab        2  34  12
3     ab        3  34  56
4     ab        4  56  12
5     ab        5  56  34
6     cd        0  78  90
7     cd        1  78  22
8     cd        2  90  78
9     cd        3  90  22
10    cd        4  22  78
11    cd        5  22  90
In [193]: dff = (df
   .....:        .groupby('alpha')
   .....:        .apply(lambda x: pd.DataFrame(list(permutations(x['beta'], 2))))
   .....:        .reset_index())
In [194]: dff = dff[['alpha', 0, 1]]
In [195]: dff.columns = ['alpha', 'beta', 'gamma']
In [196]: dff
Out[196]:
   alpha beta gamma
0     ab   12    34
1     ab   12    56
2     ab   34    12
3     ab   34    56
4     ab   56    12
5     ab   56    34
6     cd   78    90
7     cd   78    22
8     cd   90    78
9     cd   90    22
10    cd   22    78
11    cd   22    90
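The column selection and renaming steps can probably be folded into the apply call by naming the columns when each per-group frame is built. The following is a sketch of that variant rather than the answer's exact code; the helper name `pair_up` is made up here, and selecting `[['beta']]` before apply is a choice made so the helper only ever sees the 'beta' column.

```python
from itertools import permutations
import pandas as pd

df = pd.DataFrame({'alpha': ['ab', 'ab', 'ab', 'cd', 'cd', 'cd'],
                   'beta': ['12', '34', '56', '78', '90', '22']})

def pair_up(group):
    # hypothetical helper: ordered pairs of this group's 'beta' values
    return pd.DataFrame(list(permutations(group['beta'], 2)),
                        columns=['beta', 'gamma'])

result = (df.groupby('alpha')[['beta']]
            .apply(pair_up)
            .reset_index(level=0)      # move 'alpha' out of the index
            .reset_index(drop=True))   # discard the per-group row counter
print(result)
```

This avoids the intermediate level_1 column and the integer column labels 0 and 1 seen in the transcript above.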