我有像这样的数据框
df = pd.DataFrame({'User':['101','101','101','102','102','101','101','102','102','102'],'Country':['India','Japan','India','Brazil','Japan','UK','Austria','Japan','Singapore','UK'],
'Name':['RN','TN','AP','AP','TN','TN','TS','RN','TN','AP']})
如果用户和国家/地区相同,我想将名称列值组合在其他列中,如下所示
答案 0 :(得分:3)
你可以让
df['Name_E'] = df.groupby(['User', 'Country']).Name.transform(lambda x: str.join(', ', x))
答案 1 :(得分:1)
你需要:
res = df.merge(df.groupby(['User', 'Country'])['Name'].unique().reset_index().rename(columns={'Name':'Name_E'}), on=['Country', 'User'])
res['Name_E'] = res['Name_E'].apply(lambda x: ",".join(x))
输出:
User Country Name Name_E
0 101 India RN RN,AP
1 101 India AP RN,AP
2 101 Japan TN TN
3 102 Brazil AP AP
4 102 Japan TN TN,RN
5 102 Japan RN TN,RN
6 101 UK TN TN
7 101 Austria TS TS
8 102 Singapore TN TN
9 102 UK AP AP
答案 2 :(得分:1)
groupby
transform
df['all_names'] = df.groupby(['Country', 'User']).Name.transform(lambda x: ','.join(set(x)))
Country Name User all_names
0 India RN 101 AP,RN
1 Japan TN 101 TN
2 India AP 101 AP,RN
3 Brazil AP 102 AP
4 Japan TN 102 TN,RN
5 UK TN 101 TN
6 Austria TS 101 TS
7 Japan RN 102 TN,RN
8 Singapore TN 102 TN
9 UK AP 102 AP