Convert the rows for each unique ID into all possible comma-separated combinations

Time: 2019-05-07 10:10:22

Tags: python pandas dataframe combinations

I currently have data in the following format, held in a Pandas DataFrame.

Row   Uid    Lender
1     1      HSBC
2     1      Lloyds
3     1      Barclays
4     2      Lloyds
5     2      Barclays
6     2      Santander
7     2      RBS
8     2      HSBC
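
To reproduce this, the sample frame can be built roughly as follows. This is only a sketch: the question does not show how the data is loaded, and treating Row as an ordinary column rather than the index is an assumption.

```python
import pandas as pd

# Sample data copied from the table above.
# Assumption: "Row" is a regular column, not the DataFrame index.
df = pd.DataFrame({
    "Row": range(1, 9),
    "Uid": [1, 1, 1, 2, 2, 2, 2, 2],
    "Lender": ["HSBC", "Lloyds", "Barclays",
               "Lloyds", "Barclays", "Santander", "RBS", "HSBC"],
})
```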

What I need is every possible combination of the values in the Lender column for each Uid, so the output would look like this:

Row   Uid   LenderCombo
1     1     Barclays
2     1     Lloyds
3     1     HSBC
4     1     Barclays, HSBC
5     1     Barclays, Lloyds
6     1     HSBC, Lloyds 
7     1     Barclays, HSBC, Lloyds

The same applies to Uid 2, and so on. Apologies if this has been answered before; I am not sure how to approach the problem.

Thanks

1 answer:

Answer 0 (score: 5)

Use GroupBy.apply with a custom function, and turn each tuple of lenders into a string with join:

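from itertools import chain, combinations

import pandas as pd

# https://stackoverflow.com/a/5898031
def all_subsets(ss):
    # Chain the combinations of every length from 1 up to len(ss),
    # i.e. every non-empty subset, in the original order of ss.
    return chain(*map(lambda x: combinations(ss, x), range(1, len(ss)+1)))

# For each Uid, build one joined string per subset of its lenders.
df = (df.groupby('Uid')['Lender']
       .apply(lambda x: pd.Series([', '.join(y) for y in all_subsets(x)]))
       .reset_index()
       .rename(columns={'level_1':'Row'}))

print(df)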
    Uid  Row                                  Lender
0     1    0                                    HSBC
1     1    1                                  Lloyds
2     1    2                                Barclays
3     1    3                            HSBC, Lloyds
4     1    4                          HSBC, Barclays
5     1    5                        Lloyds, Barclays
6     1    6                  HSBC, Lloyds, Barclays
7     2    0                                  Lloyds
8     2    1                                Barclays
9     2    2                               Santander
10    2    3                                     RBS
11    2    4                                    HSBC
12    2    5                        Lloyds, Barclays
13    2    6                       Lloyds, Santander
14    2    7                             Lloyds, RBS
15    2    8                            Lloyds, HSBC
16    2    9                     Barclays, Santander
17    2   10                           Barclays, RBS
18    2   11                          Barclays, HSBC
19    2   12                          Santander, RBS
20    2   13                         Santander, HSBC
21    2   14                               RBS, HSBC
22    2   15             Lloyds, Barclays, Santander
23    2   16                   Lloyds, Barclays, RBS
24    2   17                  Lloyds, Barclays, HSBC
25    2   18                  Lloyds, Santander, RBS
26    2   19                 Lloyds, Santander, HSBC
27    2   20                       Lloyds, RBS, HSBC
28    2   21                Barclays, Santander, RBS
29    2   22               Barclays, Santander, HSBC
30    2   23                     Barclays, RBS, HSBC
31    2   24                    Santander, RBS, HSBC
32    2   25        Lloyds, Barclays, Santander, RBS
33    2   26       Lloyds, Barclays, Santander, HSBC
34    2   27             Lloyds, Barclays, RBS, HSBC
35    2   28            Lloyds, Santander, RBS, HSBC
36    2   29          Barclays, Santander, RBS, HSBC
37    2   30  Lloyds, Barclays, Santander, RBS, HSBC
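
The output above keeps each Uid's lenders in their original row order, whereas the question lists the names alphabetically inside each combination (for example "Barclays, HSBC"). A minimal sketch of one way to get that ordering is to sort the names before combining them. The helper name sorted_combos and the column name LenderCombo are illustrative, the use of set() assumes a lender repeated within a Uid should count only once, and the frame is rebuilt from the question's sample data so the snippet runs on its own.

```python
from itertools import combinations

import pandas as pd

# Sample data from the question, rebuilt so this snippet is self-contained.
df = pd.DataFrame({
    "Uid": [1, 1, 1, 2, 2, 2, 2, 2],
    "Lender": ["HSBC", "Lloyds", "Barclays",
               "Lloyds", "Barclays", "Santander", "RBS", "HSBC"],
})

def sorted_combos(lenders):
    """Return every non-empty combination of the lenders as a joined string.

    Names are de-duplicated and sorted first (an assumption, see above),
    so each combination reads alphabetically.
    """
    names = sorted(set(lenders))
    return [", ".join(combo)
            for size in range(1, len(names) + 1)
            for combo in combinations(names, size)]

# One output row per (Uid, combination) pair.
rows = [
    {"Uid": uid, "LenderCombo": combo}
    for uid, lenders in df.groupby("Uid")["Lender"]
    for combo in sorted_combos(lenders)
]
out = pd.DataFrame(rows)

print(out)
```

A group with n distinct lenders yields 2**n - 1 rows, which matches the 7 rows for Uid 1 and the 31 rows for Uid 2 shown above; the row count therefore doubles with every additional lender in a group.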