Getting the result of an operation for each specific group of rows in a dataframe

Time: 2019-08-15 12:55:59

Tags: python dataframe

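I have data with 44 rows x 4 columns. I want to sum every 11 rows and divide each value by the sum of its 11-row block, but my code makes the mistake of computing the sum and the division over all the rows at once.

Please suggest the simplest solution, perhaps by iterating over the dataframe?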

import pandas as pd
data = pd.DataFrame({'A':[1,2,3,1,2,3,1,2,3,2,2,4,5,6,4,5,6,4,5,6,1,1,1,3,5,1,3,5,1,3,5,4,1,7,8,9,7,8,9,7,8,9,4,2],
                     'B':[4,5,6,4,5,6,4,5,6,1,1,1,3,5,1,3,5,1,3,5,4,1,4,5,6,1,1,1,3,5,1,3,6,3,9,7,8,9,4,2,7,8,9,2],
                     'C':[7,8,9,7,8,9,7,8,9,4,2,2,3,2,2,4,5,6,4,3,6,3,9,7,8,9,4,2,7,8,9,7,8,9,7,8,9,4,2,2,1,3,5,4],
                     'D':[1,3,5,1,3,5,1,3,5,4,1,7,8,9,7,8,9,7,8,9,4,2,7,8,9,7,8,9,7,8,9,4,2,2,3,2,2,4,5,6,4,3,6,3]}
                    )


a = data[['A','B','C','D']].sum()
b = data[['A','B','C','D']] / a

data_div = b.round(4)

Here is an example of what I expect. In the image below, I sum every 4 rows of column A and divide by the sum.

[image: example of the expected output]

3 answers:

Answer 0: (score: 2)

This looks like what you expect:

import pandas as pd
data = pd.DataFrame({'A':[1,2,3,1,2,3,1,2,3,2,2,4,5,6,4,5,6,4,5,6,1,1,1,3,5,1,3,5,1,3,5,4,1,7,8,9,7,8,9,7,8,9,4,2],
                 'B':[4,5,6,4,5,6,4,5,6,1,1,1,3,5,1,3,5,1,3,5,4,1,4,5,6,1,1,1,3,5,1,3,6,3,9,7,8,9,4,2,7,8,9,2],
                 'C':[7,8,9,7,8,9,7,8,9,4,2,2,3,2,2,4,5,6,4,3,6,3,9,7,8,9,4,2,7,8,9,7,8,9,7,8,9,4,2,2,1,3,5,4],
                 'D':[1,3,5,1,3,5,1,3,5,4,1,7,8,9,7,8,9,7,8,9,4,2,7,8,9,7,8,9,7,8,9,4,2,2,3,2,2,4,5,6,4,3,6,3]}
                )


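chunk_len = 11
chunks = []
for i in range(len(data) // chunk_len):
    chunk = data.iloc[i*chunk_len:(i+1)*chunk_len]
    # Divide each value by its column's sum within this chunk
    chunks.append(chunk / chunk.sum())

# DataFrame.append was removed in pandas 2.0, so concatenate the chunks once
result = pd.concat(chunks)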

print(result)
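If the number of rows is an exact multiple of the chunk length, the same per-chunk division can also be done without a Python loop by reshaping the values with NumPy. This is only a sketch under that assumption; the small two-column frame below is made-up stand-in data, not the data from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data: 6 rows, processed in chunks of 3
data = pd.DataFrame({'A': [1, 2, 3, 4, 5, 6],
                     'B': [2, 2, 4, 1, 1, 2]})
chunk_len = 3

values = data.to_numpy(dtype=float)
# Shape (n_chunks, chunk_len, n_columns); fails unless len(data) % chunk_len == 0
blocks = values.reshape(-1, chunk_len, values.shape[1])
# Column sums inside each chunk, kept 3-D so they broadcast over the chunk's rows
block_sums = blocks.sum(axis=1, keepdims=True)
shares = (blocks / block_sums).reshape(values.shape)

result = pd.DataFrame(shares, index=data.index, columns=data.columns).round(4)
print(result)
```

With the data from the question you would set chunk_len = 11. The loop version above is easier to read and is probably the better choice unless the frame is large.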

Answer 1: (score: 0)

Assuming I understood your question correctly, you want to sum the dataframe over blocks of 11 rows. One way to do that is:

result = data.iloc[0:11].sum().sum()

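The first .sum() returns the per-column sums of the first 11 rows, and the second .sum() adds those column sums together to give the overall total. For other slices of the dataframe, change the row selection to the slice you need (e.g. data.iloc[11:22], and so on).

Exactly the same logic applies to the division.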
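To make the slicing idea concrete, here is a rough sketch that walks over the frame one slice at a time, computes the total for each slice, and applies the same logic to the division. The small frame and the names totals and shares are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in data: 6 rows, processed in slices of 3
data = pd.DataFrame({'A': [1, 2, 3, 4, 5, 6],
                     'B': [2, 2, 4, 1, 1, 2]})
chunk_len = 3

totals = []   # one overall total per slice
shares = []   # one divided frame per slice
for start in range(0, len(data), chunk_len):
    chunk = data.iloc[start:start + chunk_len]   # the end of an iloc slice is exclusive
    col_sums = chunk.sum()                       # Series with one sum per column
    totals.append(int(col_sums.sum()))           # sum of the column sums
    shares.append(chunk / col_sums)              # each value divided by its column's sum

print(totals)   # [14, 19]
```

Because the end of an iloc slice is exclusive, iloc[0:11] covers rows 0 to 10, which is 11 rows, and the next block starts at 11.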

Answer 2: (score: 0)

You can try grouping by every N rows and then applying the sum:

df.index = [i // 11  for i in range(len(df))]
df['sum_A'] = df["A"].groupby(df.index).sum()
df['div_A'] = df["A"] / df['sum_A']

Full code:

import pandas as pd

df = pd.DataFrame({'A':[1,2,3,1,2,3,1,2,3,2,2,4,5,6,4,5,6,4,5,6,1,1,1,3,5,1,3,5,1,3,5,4,1,7,8,9,7,8,9,7,8,9,4,2],
                    'B':[4,5,6,4,5,6,4,5,6,1,1,1,3,5,1,3,5,1,3,5,4,1,4,5,6,1,1,1,3,5,1,3,6,3,9,7,8,9,4,2,7,8,9,2],
                    'C':[7,8,9,7,8,9,7,8,9,4,2,2,3,2,2,4,5,6,4,3,6,3,9,7,8,9,4,2,7,8,9,7,8,9,7,8,9,4,2,2,1,3,5,4],
                    'D':[1,3,5,1,3,5,1,3,5,4,1,7,8,9,7,8,9,7,8,9,4,2,7,8,9,7,8,9,7,8,9,4,2,2,3,2,2,4,5,6,4,3,6,3]}
                    )

df.index = [i // 11  for i in range(len(df))]     # Define new index for groupby
df['sum_A'] = df["A"].groupby(df.index).sum()     # Apply sum per group
df['div_A'] = df["A"] / df['sum_A']               # Divide each row by the given sum
print(df)
#    A  B  C  D  sum_A     div_A
# 0  1  4  7  1     22  0.045455
# 0  2  5  8  3     22  0.090909
# 0  3  6  9  5     22  0.136364
# 0  1  4  7  1     22  0.045455
# 0  2  5  8  3     22  0.090909
# 0  3  6  9  5     22  0.136364
# 0  1  4  7  1     22  0.045455
# 0  2  5  8  3     22  0.090909
# 0  3  6  9  5     22  0.136364
# 0  2  1  4  4     22  0.090909
# 0  2  1  2  1     22  0.090909
# 1  4  1  2  7     47  0.085106
# 1  5  3  3  8     47  0.106383
# 1  6  5  2  9     47  0.127660
# 1  4  1  2  7     47  0.085106
# 1  5  3  4  8     47  0.106383
# 1  6  5  5  9     47  0.127660
# 1  4  1  6  7     47  0.085106
# 1  5  3  4  8     47  0.106383
# 1  6  5  3  9     47  0.127660
# 1  1  4  6  4     47  0.021277
# 1  1  1  3  2     47  0.021277
# 2  1  4  9  7     32  0.031250
# 2  3  5  7  8     32  0.093750
# 2  5  6  8  9     32  0.156250
# 2  1  1  9  7     32  0.031250
# 2  3  1  4  8     32  0.093750
# 2  5  1  2  9     32  0.156250
# 2  1  3  7  7     32  0.031250
# 2  3  5  8  8     32  0.093750
# 2  5  1  9  9     32  0.156250
# 2  4  3  7  4     32  0.125000
# 2  1  6  8  2     32  0.031250
# 3  7  3  9  2     78  0.089744
# 3  8  9  7  3     78  0.102564
# 3  9  7  8  2     78  0.115385
# 3  7  8  9  2     78  0.089744
# 3  8  9  4  4     78  0.102564
# 3  9  4  2  5     78  0.115385
# 3  7  2  2  6     78  0.089744
# 3  8  7  1  4     78  0.102564
# 3  9  8  3  3     78  0.115385
# 3  4  9  5  6     78  0.051282
# 3  2  2  4  3     78  0.025641
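The code above replaces the index and only handles column A. If you want to keep the original index and get the sum and the ratio for every column, one possible variation is to group by a separate label array and use transform. This is a sketch with made-up stand-in data; the names groups, sums and out are just for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data: 6 rows, grouped in blocks of 3
df = pd.DataFrame({'A': [1, 2, 3, 4, 5, 6],
                   'B': [2, 2, 4, 1, 1, 2]})
n = 3

# One group label per row (0,0,0,1,1,1); df.index is left untouched
groups = np.arange(len(df)) // n

# transform('sum') returns a frame shaped like df, holding each row's group sum
sums = df.groupby(groups).transform('sum')

out = pd.concat([df,
                 sums.add_prefix('sum_'),
                 (df / sums).add_prefix('div_')], axis=1)
print(out)
```

This also works when the number of rows is not a multiple of n; the last group is then simply shorter.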

Hope this helps!