Python, Pandas DataFrame: merge rows that share the same values in two columns and aggregate the data in those rows

Date: 2021-07-31 00:25:45

Tags: python python-3.x pandas dataframe pandas-groupby

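I have this DataFrame and I am trying to merge any rows that have the same workDate and ID. I don't know how many columns the DataFrame will have. It could have hundreds, so I am looking for a way to merge the rows without specifying every column name.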

    |workDate   |ID       | Hours |Groundsman  |names2  |Teachers       |Profs
---------------------------------------------------------------------------------
0   |2020-01-09 |13702    | 1.0   |   Ted      |        |               |
1   |2020-01-09 |13702    | 1.0   |            |Seline  |               |
2   |2020-01-10 |13702    | 20.0  |            |Ted,Sam |               |
3   |2020-01-10 |13702    | 20.0  |            |        |Pete,Norm,Tim  |
4   |2020-01-10 |13702    | 20.0  |            |        |               |Joe

Desired output:

    |workDate   |ID       | Hours |Groundsman  |names2  |Teachers       |Profs
---------------------------------------------------------------------------------
0   |2020-01-09 |13702    | 1.0   |   Ted      |Seline  |               |
1   |2020-01-10 |13702    | 20.0  |            |Ted,Sam |Pete,Norm,Tim  |Joe

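I have tried several ways of grouping but did not get the output I want. As mentioned, the DataFrame may have many other columns, so I cannot use a solution that expects all the column names to be listed. Essentially it is a group-by on workDate and ID, but I also want to keep all the values in the other columns.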

1 Answer:

Answer 0: (score: 3)

Try:

import pandas as pd
import numpy as np #only required because the example df creation needs np.nan

df = pd.DataFrame({'workDate': {0: '2020-01-09',
  1: '2020-01-09',
  2: '2020-01-10',
  3: '2020-01-10',
  4: '2020-01-10'},
 'ID': {0: 13702, 1: 13702, 2: 13702, 3: 13702, 4: 13702},
 ' Hours': {0: 1.0, 1: 1.0, 2: 20.0, 3: 20.0, 4: 20.0},
 'Groundsman': {0: '   Ted', 1: np.nan, 2: np.nan, 3: np.nan, 4: np.nan},
 'names2': {0: np.nan, 1: 'Seline', 2: 'Ted,Sam', 3: np.nan, 4: np.nan},
 'Teachers': {0: np.nan, 1: np.nan, 2: np.nan, 3: 'Pete,Norm,Tim', 4: np.nan},
 'Profs': {0: np.nan, 1: np.nan, 2: np.nan, 3: np.nan, 4: 'Joe'}})

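# Group on the two key columns; GroupBy.last() skips NaN by default, so each
# group collapses to the last non-null value found in every other column.
print(df.groupby(by=['workDate', 'ID']).last().reset_index())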

Output:

     workDate     ID   Hours Groundsman   names2       Teachers Profs
0  2020-01-09  13702     1.0        Ted   Seline           None  None
1  2020-01-10  13702    20.0       None  Ted,Sam  Pete,Norm,Tim   Joe
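
Why this works, as a minimal sketch: `GroupBy.last()` (like `GroupBy.first()`) skips NaN by default, so within each (workDate, ID) group every other column collapses to its one non-null value, and no column names beyond the two keys have to be listed. The frame below is a rebuilt copy of the sample data, and stripping the stray leading spaces from `' Hours'` and `'   Ted'` is my own assumption about what was intended. The names `keys`, `merged_last` and `merged_first` are illustrative only.

```python
import numpy as np
import pandas as pd

# Rebuilt sample data (assumption: leading spaces in the original dict were unintended)
df = pd.DataFrame({
    "workDate": ["2020-01-09", "2020-01-09", "2020-01-10", "2020-01-10", "2020-01-10"],
    "ID": [13702, 13702, 13702, 13702, 13702],
    "Hours": [1.0, 1.0, 20.0, 20.0, 20.0],
    "Groundsman": ["Ted", np.nan, np.nan, np.nan, np.nan],
    "names2": [np.nan, "Seline", "Ted,Sam", np.nan, np.nan],
    "Teachers": [np.nan, np.nan, np.nan, "Pete,Norm,Tim", np.nan],
    "Profs": [np.nan, np.nan, np.nan, np.nan, "Joe"],
})

# Only the grouping keys are named; every other column is picked up automatically.
keys = ["workDate", "ID"]

# last() keeps the last non-null value per column in each group.
merged_last = df.groupby(keys).last().reset_index()

# first() keeps the first non-null value instead. For this data the two agree,
# because each column holds at most one non-null value per group.
merged_first = df.groupby(keys).first().reset_index()

print(merged_last)
print(merged_first)
```

One caveat: if a group ever holds two different non-null values in the same column, `last()` keeps only the later one and `first()` only the earlier one, so neither concatenates them. That case would need a custom aggregation.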