Group every n:n + k rows in pandas, with the last chunk taking the remainder

Asked: 2017-10-12 14:46:43

Tags: python pandas

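I need to slice the rows of each group into chunks of four. In the data below, group 3 is a single chunk because it has exactly 4 rows. Group 4 has 6 rows, so it should become two chunks: the first DataFrame has 4 rows and the second has 2.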

 Go        Per  Votes      group
NaN  40.726126    NaN          3   
NaN  40.727271   36.0          3   
NaN  40.719560    NaN          3   
NaN  40.729198   19.0          3  
NaN  40.726126    NaN          4   
NaN  40.727271   36.0          4   
NaN  40.719560    NaN          4   
NaN  40.729198   19.0          4 
NaN  40.726126    NaN          4   
NaN  40.727271   36.0          4   
NaN  40.719560    NaN          5   
NaN  40.729198   19.0          5 

This is what I have so far:

for i in unique_group:
    this_group = df_group[df_group['group'] == i]
    count_items = this_group.shape[0]
    if count_items > 4:
        remainder = count_items % 4
        divide = int(count_items / 4)
        repeat_group = divide
    else:
        repeat_group = 1
    for repeat in range(1, repeat_group+1):
        if count_items > 4:
            if repeat==repeat_group:
                this_group = this_group.iloc[:repeat*4+remainder,:]
                print("last group")
            elif repeat == 1:
                this_group = this_group.iloc[:repeat*4,:]
                print("first group")
            else:
                this_group = this_group.iloc[(repeat-1)*4+1:repeat*4,:]
                print("between group")
        print(this_group)

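My output: when I get to group 4, it only prints the first chunk, even though it reports "last group" or "between group" (depending on how long my list is).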

1 Answer:

Answer 0: (score: 0)

I am assuming your DataFrame is named df.

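def chunker(seq, size):
    # Return a generator of consecutive slices of at most `size` rows;
    # the final slice holds whatever rows are left over.
    return (seq[pos:pos + size] for pos in range(0, len(seq), size))

grouped = df.groupby('group')
groups = []
for _, gr in grouped:
    # Split each group into pieces of up to 4 rows.
    for chunk in chunker(gr, 4):
        groups.append(chunk)

# Print the number of rows in each piece.
for gr in groups:
    print(len(gr))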

This creates a list containing all of the chunked groups.
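To make the idea easy to try, here is a minimal self-contained sketch of the same chunking approach. The sample frame is rebuilt from the table in the question. The helper name `split_group` and its `merge_remainder` flag are hypothetical additions for illustration and are not part of the answer above. The flag covers the other reading of the title, where a short trailing piece is folded into the previous chunk instead of standing alone.

```python
import numpy as np
import pandas as pd

# Sample frame mirroring the question's table (the Go column is all NaN there).
df = pd.DataFrame({
    "Go": np.nan,
    "Per": [40.726126, 40.727271, 40.719560, 40.729198] * 3,
    "Votes": [np.nan, 36.0, np.nan, 19.0] * 3,
    "group": [3] * 4 + [4] * 6 + [5] * 2,
})


def split_group(frame, size, merge_remainder=False):
    """Split `frame` into consecutive pieces of up to `size` rows.

    Hypothetical helper: with merge_remainder=True, a short trailing
    piece is folded into the previous piece instead of standing alone.
    """
    starts = list(range(0, len(frame), size))
    if merge_remainder and len(starts) > 1 and len(frame) % size:
        # Drop the start of the short tail so it joins the previous piece.
        starts.pop()
    stops = starts[1:] + [len(frame)]
    # Use positional slicing so the result does not depend on index labels.
    return [frame.iloc[a:b] for a, b in zip(starts, stops)]


# Remainder kept as its own chunk, as in the answer above.
plain = [c for _, g in df.groupby("group") for c in split_group(g, 4)]
# Remainder folded into the last full chunk of each group.
merged = [
    c
    for _, g in df.groupby("group")
    for c in split_group(g, 4, merge_remainder=True)
]

plain_sizes = [len(c) for c in plain]
merged_sizes = [len(c) for c in merged]
print(plain_sizes)   # [4, 4, 2, 2]
print(merged_sizes)  # [4, 6, 2]
```

Each chunk is built from the untouched group frame rather than from a variable that is reassigned inside the loop, so later chunks are not taken from an already truncated frame.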