Question

Let's say we have the following DataFrame:

   from   to  id        p       q
0   103  104   1   415.00    0.00
1   102  105   1  1203.11 -395.44
2   104  103   1  -414.35   62.23
3   105  102   1 -1197.37  489.83

or, as code:

import pandas as pd

df = pd.DataFrame(
    data={
        'from': [103, 102, 104, 105],
        'to': [104, 105, 103, 102],
        'id': [1] * 4,
        'p': [415, 1203.11, -414.35, -1197.37],
        'q': [0, -395.44, 62.23, 489.83]
    })

The goal is to merge the rows that share the same from and to values (in either order). In the example above, rows 0 and 2 need to be merged, as do rows 1 and 3.

The output should look like this:

to

Of course, the following would also be acceptable:

   from      to  id        p       q       p1      q1
0   103     104   1   415.00    0.00  -414.35   62.23
1   102     105   1  1203.11 -395.44 -1197.37  489.83

Thanks for your help :)

Answer 1

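First sort the from and to columns within each row with numpy.sort, then create a counter Series with GroupBy.cumcount. Reshape with DataFrame.set_index and DataFrame.unstack, sort the second column level with DataFrame.sort_index, and finally flatten the MultiIndex columns with f-strings and convert the MultiIndex in the index back to columns with DataFrame.reset_index: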

import numpy as np

df[['from','to']] = np.sort(df[['from','to']], axis=1)
g = df.groupby(['from','to']).cumcount()

df = df.set_index(['from','to','id', g]).unstack().sort_index(level=1, axis=1)
df.columns = [f'{a}{b}' for a, b in df.columns]
df = df.reset_index()
print(df)
   from   to  id       p0      q0       p1      q1
0   102  105   1  1203.11 -395.44 -1197.37  489.83
1   103  104   1   415.00    0.00  -414.35   62.23
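
To make the first two steps easier to follow, here is a small sketch that runs them on the question's data and prints the intermediate result. The names `pairs` and `counter` are my own and only for illustration; they do not appear in the answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    data={
        'from': [103, 102, 104, 105],
        'to': [104, 105, 103, 102],
        'id': [1] * 4,
        'p': [415, 1203.11, -414.35, -1197.37],
        'q': [0, -395.44, 62.23, 489.83]
    })

# Sort each row's (from, to) pair so that 104 -> 103 becomes 103 -> 104.
pairs = np.sort(df[['from', 'to']].to_numpy(), axis=1)
df[['from', 'to']] = pairs

# Number the rows within each (from, to) group:
# 0 for the first occurrence, 1 for the second.
counter = df.groupby(['from', 'to']).cumcount()

print(df[['from', 'to']].assign(counter=counter))
```

After sorting, rows 0 and 2 share the pair (103, 104) and rows 1 and 3 share (102, 105). The counter is what unstack later uses to decide whether a row's values land in the p0/q0 or the p1/q1 columns.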

Answer 2

Another solution:

# sort from and to within each row first
df[['from', 'to']] = np.sort(df[['from', 'to']], axis=1)
(
    df.groupby(['from', 'to'])
    # group and concatenate all p and q values of the same group into one row
    .apply(lambda x: x[['p', 'q']].values.reshape(1, -1)[0])
    # convert the arrays of p and q values to a DataFrame
    .pipe(lambda x: pd.DataFrame(x.tolist(), index=x.index))
    # rename the columns: even positions hold p, odd positions hold q
    .rename(columns=lambda x: f"{'pq'[x % 2]}{x // 2}")
    .reset_index()
)

   from   to       p0      q0       p1      q1
0   102  105  1203.11 -395.44 -1197.37  489.83
1   103  104   415.00    0.00  -414.35   62.23
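
The reshape step in this answer may be easier to see on a single group. The sketch below uses the p and q values of the (103, 104) group from the question; the names `group`, `flat` and `names` are hypothetical, and the naming rule is just one way to label the flattened positions.

```python
import numpy as np

# One group's values: two rows (first and second occurrence), two columns (p, q).
group = np.array([[415.00, 0.00],
                  [-414.35, 62.23]])

# reshape(1, -1) yields a single row of shape (1, 4); [0] takes it as a 1-D array.
flat = group.reshape(1, -1)[0]
print(flat.tolist())  # [415.0, 0.0, -414.35, 62.23]

# Position i holds p when i is even and q when i is odd;
# i // 2 is the occurrence number within the group.
names = [f"{'pq'[i % 2]}{i // 2}" for i in range(flat.size)]
print(names)  # ['p0', 'q0', 'p1', 'q1']
```

Because the values are interleaved as p, q, p, q, a label that depends only on `i // 2` cannot tell p from q, which is why the position's parity is needed as well.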

Merging two rows of a DataFrame