Cumsum an entire table and reset at zero

Time: 2017-04-20 16:27:09

Tags: python-2.7 pandas

I have the following dataframe.

d = pd.DataFrame({'one' : [0,1,1,1,0,1],'two' : [0,0,1,0,1,1]})

d

   one  two
0    0    0
1    1    0
2    1    1
3    1    0
4    0    1
5    1    1

I want a cumulative sum that resets at zero.

The desired output should be

   one  two
0    0    0
1    1    0
2    2    1
3    3    0
4    0    1
5    1    2

I tried using groupby, but it doesn't work on the whole table.

4 Answers:

Answer 0: (score: 4)

# d is the dataframe from the question; each run starting at a zero becomes its own group
df2 = d.apply(lambda x: x.groupby((~x.astype(bool)).cumsum()).cumsum())
print(df2)

Output:

   one  two
0    0    0
1    1    0
2    2    1
3    3    0
4    0    1
5    1    2
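
To see why the one-liner works, it may help to unpack it for a single column. This is only a sketch of the intermediate steps; the names `is_zero`, `group_id` and `steps` are mine, not part of the answer.

```python
import pandas as pd

d = pd.DataFrame({'one': [0, 1, 1, 1, 0, 1], 'two': [0, 0, 1, 0, 1, 1]})

col = d['one']
is_zero = ~col.astype(bool)               # True wherever the value is 0
group_id = is_zero.cumsum()               # goes up by 1 at every zero, so each run gets its own label
result = col.groupby(group_id).cumsum()   # the cumulative sum restarts inside each group

# Put the steps side by side for inspection
steps = pd.DataFrame({'value': col, 'is_zero': is_zero,
                      'group_id': group_id, 'result': result})
print(steps)
```

`df.apply` then simply repeats this per column, which is what lets it cover the whole table.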

Answer 1: (score: 3)

pandas

def cum_reset_pd(df):
    csum = df.cumsum()
    # fillna(0) covers rows before the first zero, where there is no baseline yet
    return (csum - csum.where(df == 0).ffill().fillna(0)).astype(df.dtypes)

cum_reset_pd(d)

   one  two
0    0    0
1    1    0
2    2    1
3    3    0
4    0    1
5    1    2
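
The idea behind the pandas version is that the reset sum equals the plain running total minus the running total as it stood at the most recent zero. Here is a step-by-step sketch on made-up data (the frame and the name `baseline` are assumptions for illustration), including a column that does not start with a zero:

```python
import pandas as pd

# Made-up example data, not from the question
df = pd.DataFrame({'a': [1, 1, 0, 1, 1], 'b': [0, 2, 3, 0, 4]})

csum = df.cumsum()
# Running total at the most recent zero; NaN for rows before the first zero
baseline = csum.where(df == 0).ffill()
# Rows before any zero have no baseline yet, so use 0 there
baseline = baseline.fillna(0)
# The subtraction produces floats, so cast back to the original dtypes
result = (csum - baseline).astype(df.dtypes)
print(result)
```

Without the `fillna(0)` step, column `a` would contain NaN in its first two rows and the cast back to integers would fail.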

numpy

def cum_reset_np(df):
    v = df.values
    z = np.zeros_like(v)
    j, i = np.where(v.T)
    r = np.arange(1, i.size + 1)
    p = np.where(
        np.append(False, (np.diff(i) != 1) | (np.diff(j) != 0))
    )[0]
    b = np.append(0, np.append(p, r.size))
    z[i, j] = r - b[:-1].repeat(np.diff(b))
    return pd.DataFrame(z, df.index, df.columns)

cum_reset_np(d)

   one  two
0    0    0
1    1    0
2    2    1
3    3    0
4    0    1
5    1    2

Why go through this trouble?
Because it's faster!
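
If you want to check the speed claim on your own machine, something like the following could be used. It is a rough sketch: the frame size, the random 0/1 data and the repeat count are arbitrary choices of mine, the two functions are restated so the snippet runs on its own, and the timings will vary with hardware and library versions.

```python
import timeit

import numpy as np
import pandas as pd


def cum_reset_pd(df):
    csum = df.cumsum()
    return (csum - csum.where(df == 0).ffill().fillna(0)).astype(df.dtypes)


def cum_reset_np(df):
    # Counts consecutive non-zero entries, so it assumes the data is 0/1
    v = df.values
    z = np.zeros_like(v)
    j, i = np.where(v.T)
    r = np.arange(1, i.size + 1)
    p = np.where(
        np.append(False, (np.diff(i) != 1) | (np.diff(j) != 0))
    )[0]
    b = np.append(0, np.append(p, r.size))
    z[i, j] = r - b[:-1].repeat(np.diff(b))
    return pd.DataFrame(z, df.index, df.columns)


# Arbitrary test data: 10000 rows x 10 columns of random 0/1 values
rng = np.random.RandomState(0)
big = pd.DataFrame(rng.randint(0, 2, size=(10000, 10)), columns=list('abcdefghij'))

# Make sure both versions agree before comparing their speed
assert cum_reset_pd(big).equals(cum_reset_np(big))

for fn in (cum_reset_pd, cum_reset_np):
    seconds = timeit.timeit(lambda: fn(big), number=20)
    print('%s: %.4f s for 20 runs' % (fn.__name__, seconds))
```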


Answer 2: (score: 0)

This should do it:

import pandas as pd

d = {'one' : [0,1,1,1,0,1],'two' : [0,0,1,0,1,1]}
one = d['one']
two = d['two']
i = 0
new_one = []
for item in one:
    if item == 0:
        i = 0
    else:
        i += item
    new_one.append(i)

j = 0
new_two = []
for item in two:
    if item == 0:
        j = 0
    else:
        j += item
    new_two.append(j)

d['one'], d['two'] = new_one, new_two
df = pd.DataFrame(d)
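
The same loop can be written once and reused for any number of columns. A minimal sketch, where the helper name `cumsum_reset` is my own:

```python
import pandas as pd


def cumsum_reset(values):
    """Running total of `values` that restarts whenever a 0 is seen."""
    total = 0
    out = []
    for item in values:
        total = 0 if item == 0 else total + item
        out.append(total)
    return out


d = {'one': [0, 1, 1, 1, 0, 1], 'two': [0, 0, 1, 0, 1, 1]}
# Apply the helper to every column instead of copying the loop per column
df = pd.DataFrame({name: cumsum_reset(col) for name, col in d.items()})
print(df)
```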

Answer 3: (score: 0)

This one doesn't use Pandas, but uses NumPy and a list comprehension:

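import numpy as np

d = {'one': [0,1,1,1,0,1], 'two': [0,0,1,0,1,1]}
out = {}
for key in d.keys():
    l = d[key]
    # positions of the zeros, i.e. where the list is split
    # (note: anything before the first zero is dropped, so each list should start with 0)
    indices = np.argwhere(np.array(l)==0).flatten()
    indices = np.append(indices, len(l))
    out[key] = np.concatenate([np.cumsum(l[indices[n-1]:indices[n]]) \
                               for n in range(1, indices.shape[0])]).ravel()

print(out)

First, I find all occurrences of 0 (the positions at which to split the list), then I compute the cumsum of each resulting sublist and insert them into a new dict.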
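
If the lists may not start with a zero, a variant based on `np.split` keeps the leading piece as well. This is a sketch of the same split-then-cumsum idea rather than the answer's exact code, and the function name `cumsum_reset_np` is made up:

```python
import numpy as np


def cumsum_reset_np(values):
    arr = np.asarray(values)
    # Positions of the zeros: each one starts a new piece
    zero_positions = np.flatnonzero(arr == 0)
    # The first piece is whatever precedes the first zero (possibly empty)
    pieces = np.split(arr, zero_positions)
    return np.concatenate([np.cumsum(p) for p in pieces])


d = {'one': [0, 1, 1, 1, 0, 1], 'two': [0, 0, 1, 0, 1, 1]}
out = {key: cumsum_reset_np(col) for key, col in d.items()}
print(out)
```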