Question

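I have two DataFrames. One is empty and the other has many rows. I want to group the second DataFrame by the values in a column, slice the first 3 rows of each group, and add them to the empty DataFrame. Each set of 3 rows should go into a new column.

I have already tried concat, join, and append, but I can't figure out how to do it.

My code so far: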

import pandas as pd

df = pd.DataFrame()
df2 = pd.DataFrame({'C': [20, 20, 20, 20, 10, 10, 10, 30, 30, 30],
                   'D': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]})

df_dictionary = df2.groupby("C")

for key, df_values in df_dictionary:
    df_values = df_values.head(3)
    df = pd.concat(df, df_values["D"], axis=1)
    print(df)

The result I want, built up from the empty DataFrame, should look like this:

index   col 1   col 2   col 3
0   1   5   8
1   2   6   9
2   3   7   10

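I want to add the first 3 values of column D from each group to the empty DataFrame, placing them in a new column each time.

Does anyone have a suggestion?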

Answer 1

I use cumcount followed by pivot:

n = 3
(df2.assign(key=df2.groupby('C').cumcount())
    .pivot(index='key', columns='C', values='D')
    .iloc[:n, :])
Out[730]: 
C     10   20    30
key                
0    5.0  1.0   8.0
1    6.0  2.0   9.0
2    7.0  3.0  10.0
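The one-liner above can be broken into steps. The following is a minimal sketch, not the answerer's exact code: it assumes the same `df2` as in the question and filters to the first `n` rows of each group before pivoting (the names `key`, `first_n`, and `result` are illustrative). Filtering first means the pivot has no missing cells to pad, so the values should stay integers instead of becoming floats.

```python
import pandas as pd

df2 = pd.DataFrame({'C': [20, 20, 20, 20, 10, 10, 10, 30, 30, 30],
                    'D': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]})
n = 3

# Step 1: number the rows inside each group (0, 1, 2, ...).
key = df2.groupby('C').cumcount()

# Step 2: keep only the first n rows of every group before pivoting,
# so the pivot does not need to pad shorter groups with NaN.
first_n = df2.assign(key=key)[key < n]

# Step 3: one column per group, one row per position within the group.
result = first_n.pivot(index='key', columns='C', values='D')
print(result)
```

The column labels are the group values from `C`; rename them afterwards if labels such as `col 1` are preferred.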

Answer 2

This answer has one requirement: every group must have at least n values.

Using head + reshape:

n = 3
u = df2.groupby('C').head(n)['D'].values

# shape (n, number of groups): each column holds one group's first n values
pd.DataFrame(u.reshape(n, -1, order='F'), columns=[f'col {i+1}' for i in range(len(u) // n)])

   col 1  col 2  col 3
0      1      5      8
1      2      6      9
2      3      7     10
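Because of the requirement stated above, it may be worth checking the group sizes before reshaping. The sketch below is one possible guard, not part of the original answer. It assumes the rows of each group are contiguous in `df2` (as they are in the question) and shapes the array as `(n, number of groups)`, so it does not depend on the number of groups happening to equal `n`. The names `grouped`, `sizes`, `n_groups`, and `wide` are illustrative.

```python
import pandas as pd

df2 = pd.DataFrame({'C': [20, 20, 20, 20, 10, 10, 10, 30, 30, 30],
                    'D': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]})
n = 3

# sort=False keeps groups in order of first appearance (20, 10, 30).
grouped = df2.groupby('C', sort=False)

# Fail early with a readable message if any group is too short to reshape.
sizes = grouped.size()
if (sizes < n).any():
    short = sizes[sizes < n].index.tolist()
    raise ValueError(f"groups with fewer than {n} rows: {short}")

# head(n) keeps the original row order, so u is group 20's values,
# then group 10's, then group 30's (assuming contiguous groups).
u = grouped.head(n)['D'].to_numpy()
n_groups = len(sizes)

# Fortran order fills column by column: column j receives u[j*n:(j+1)*n].
wide = pd.DataFrame(u.reshape(n, n_groups, order='F'),
                    columns=[f'col {i + 1}' for i in range(n_groups)])
print(wide)
```

If the groups can be interleaved in the input, a stable sort on `C` before calling `head` would be needed; that case is not handled here.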

Answer 3

My solution uses the dictionary returned by groupby.groups to construct the new DataFrame:

gb = df2.set_index('D').groupby('C')
pd.DataFrame.from_dict(gb.groups, orient='index').iloc[:,:3].T

Out[2033]:
   10  20  30
0   5   1   8
1   6   2   9
2   7   3  10

Or transpose with T first and then take head:

pd.DataFrame.from_dict(gb.groups, orient='index').T.head(3)

Out[2034]:
    10   20    30
0  5.0  1.0   8.0
1  6.0  2.0   9.0
2  7.0  3.0  10.0
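A related way to build the frame from a dictionary keyed by group, without moving `D` into the index, is sketched below. This is an alternative under the same assumptions as the question's `df2`, not the answerer's method. It is also close to the loop in the question: each slice is re-indexed from 0 so the pieces line up row by row. The names `pieces` and `wide` are illustrative.

```python
import pandas as pd

df2 = pd.DataFrame({'C': [20, 20, 20, 20, 10, 10, 10, 30, 30, 30],
                    'D': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]})
n = 3

# Map each group key to its first n values of D. reset_index(drop=True)
# gives every slice the index 0..n-1 so they align when placed side by side.
pieces = {key: grp['D'].head(n).reset_index(drop=True)
          for key, grp in df2.groupby('C')}

# Passing a dict to concat uses its keys as the column labels.
wide = pd.concat(pieces, axis=1)
print(wide)
```

With this approach a group that has fewer than `n` rows would simply leave NaN in its column rather than raise an error.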

Add slices of a DataFrame to another DataFrame as new columns
