Question

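I need to split the rows into chunks of four. Here, group 3 is a single chunk of 4 rows. Group 4 yields two chunks: the first data frame has 4 rows and the second has 2.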

 Go        Per  Votes      group
NaN  40.726126    NaN          3   
NaN  40.727271   36.0          3   
NaN  40.719560    NaN          3   
NaN  40.729198   19.0          3  
NaN  40.726126    NaN          4   
NaN  40.727271   36.0          4   
NaN  40.719560    NaN          4   
NaN  40.729198   19.0          4 
NaN  40.726126    NaN          4   
NaN  40.727271   36.0          4   
NaN  40.719560    NaN          5   
NaN  40.729198   19.0          5

This is what I have:

for i in unique_group:
    this_group = df_group[df_group['group'] == i]
    count_items = this_group.shape[0]
    if count_items > 4:
        remainder = count_items % 4
        divide = int(count_items / 4)
        repeat_group = divide
    else:
        repeat_group = 1
    for repeat in range(1, repeat_group+1):
        if count_items > 4:
            if repeat==repeat_group:
                this_group = this_group.iloc[:repeat*4+remainder,:]
                print "last group"
            elif repeat == 1:
                this_group = this_group.iloc[:repeat*4,:]
                print "first group"
            else:
                this_group = this_group.iloc[(repeat-1)*4+1:repeat*4,:]        
                print "between group"
        print this_group

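My output: when I reach group 4, it only prints the first chunk, even though it says "last group" or "between group" (depending on how long my list is).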

Answer 1

I'm assuming your DataFrame is named df.

def chunker(seq, size):
    return (seq[pos:pos + size] for pos in range(0, len(seq), size))

grouped = df.groupby('group')
groups = []
for _, gr in grouped:
    for chunk in chunker(gr, 4):
        groups.append(chunk)

for gr in groups:
    print(len(gr))

This creates a list containing all of the chunked groups.
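
For reference, here is a minimal, self-contained sketch of the same idea run against the question's sample. It assumes only the group column matters, so the other columns are left out. chunker_merge_tail is a hypothetical helper, not part of the approach above: it covers the case where a short trailing slice should be folded into the previous chunk instead of standing alone.

```python
import pandas as pd

# Sample data: only the 'group' column from the question (other columns omitted).
df = pd.DataFrame({"group": [3] * 4 + [4] * 6 + [5] * 2})


def chunker(seq, size):
    """Yield consecutive row slices of at most `size` rows."""
    return (seq.iloc[pos:pos + size] for pos in range(0, len(seq), size))


def chunker_merge_tail(seq, size):
    """Hypothetical variant: fold a short trailing slice into the previous chunk."""
    n_chunks = max(len(seq) // size, 1)
    for i in range(n_chunks):
        start = i * size
        # The last chunk runs to the end of the frame, absorbing any remainder.
        stop = len(seq) if i == n_chunks - 1 else start + size
        yield seq.iloc[start:stop]


plain_sizes = [len(c) for _, g in df.groupby("group") for c in chunker(g, 4)]
merged_sizes = [len(c) for _, g in df.groupby("group") for c in chunker_merge_tail(g, 4)]
print(plain_sizes)   # [4, 4, 2, 2]
print(merged_sizes)  # [4, 6, 2]
```

With the plain chunker, the six rows of group 4 come out as a chunk of 4 followed by a chunk of 2. With the merging variant they stay together as one chunk of 6.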

Group every n:n+k rows in pandas, with the remainder added to the last group
