Cumulative sum and carry-over: vectorizing with pandas

Time: 2018-05-04 18:23:35

Tags: python python-2.7 pandas

Can I use pandas.Series math to perform the following operation on a and b without an explicit loop?

In [38]: a = pd.Series([4, 8, 3, 6, 2])

In [39]: b = pd.Series([3, 9, 5, 5, 4])

In [40]: alist = a.tolist()
    ...: blist = b.tolist()
    ...: for i in range(len(alist)):
    ...:     diff = max(0, alist[i] - blist[i])
    ...:     try:
    ...:         alist[i + 1] = alist[i + 1] + diff
    ...:     except IndexError:
    ...:         if diff > 0:
    ...:             alist.append(diff)
    ...:     blist[i] = max(0, blist[i] - alist[i])
    ...: 

In [41]: alist
Out[41]: [4, 9, 3, 6, 3]

In [42]: blist
Out[42]: [0, 0, 2, 0, 1]

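If the difference between a and b is greater than zero, I add that difference to the next value of a, then subtract that accumulated value of a from b (floored at zero), and so on for each following position.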

4 answers:

Answer 0: (score: 2)

IIUC, you need shift (it can replace this line: alist[i + 1] = alist[i + 1] + diff):

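# Carry the positive part of (a - b) one position forward with shift()
alist = a.add((a - b).clip(lower=0).shift(), fill_value=0).astype(int)
# clip_lower() was removed in pandas 1.0; clip(lower=0) is the equivalent
blist = (b - alist).clip(lower=0)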
alist
Out[340]: 
0    4
1    9
2    3
3    6
4    3

blist
Out[341]: 
0    0
1    0
2    2
3    0
4    1
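
One point that may be worth checking before relying on the shift() version: it carries the positive part of the original a - b by one position only, while the loop in the question computes each difference from the already updated value of a, so a surplus can cascade across several positions. The sketch below compares the two. The function names carry_loop and carry_shift and the second data set are made up for illustration; they are not part of the question or the answer.

```python
import pandas as pd


def carry_loop(a_values, b_values):
    """Explicit loop from the question, operating on plain lists."""
    alist, blist = list(a_values), list(b_values)
    for i in range(len(blist)):
        diff = max(0, alist[i] - blist[i])
        if i + 1 < len(alist):
            alist[i + 1] += diff      # surplus is carried into the next a
        elif diff > 0:
            alist.append(diff)        # trailing surplus becomes a new element
        blist[i] = max(0, blist[i] - alist[i])
    return alist, blist


def carry_shift(a, b):
    """One-step carry with shift(), following the answer above."""
    a2 = a.add((a - b).clip(lower=0).shift(), fill_value=0).astype(int)
    b2 = (b - a2).clip(lower=0)
    return a2.tolist(), b2.tolist()


# Data from the question: no surplus cascades, so both versions agree.
a = pd.Series([4, 8, 3, 6, 2])
b = pd.Series([3, 9, 5, 5, 4])
print(carry_loop(a.tolist(), b.tolist()) == carry_shift(a, b))    # True

# Hypothetical data where the surplus from position 0 reaches position 2.
a2 = pd.Series([5, 3, 1])
b2 = pd.Series([1, 5, 5])
print(carry_loop(a2.tolist(), b2.tolist()))   # ([5, 7, 3], [0, 0, 2])
print(carry_shift(a2, b2))                    # ([5, 7, 1], [0, 0, 4])
```

If the data can contain such cascades, the one-step shift() formula is not a drop-in replacement for the loop.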

Answer 1: (score: 2)

Here is one way using numpy (the code for this answer is not preserved here).

Answer 2: (score: 1)

Here is another concise way, using where and roll:

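import numpy as np

# np.roll is circular: the last element of (a - b) wraps around to index 0.
# That is harmless here because the last difference (2 - 4) is negative.
alist = np.where(np.roll(a - b > 0, 1), a + np.roll(a - b, 1), a)
blist = np.maximum(b.values - alist, 0)

print(alist)
# [4 9 3 6 3]
print(blist)
# [0 0 2 0 1]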
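
Because np.roll is circular, a positive difference in the last position would wrap around and be added to the first element of a. A minimal sketch of one way to guard against that, using made-up data in which the last a exceeds the last b (the variable names diff, rolled and carry are illustrative only):

```python
import numpy as np
import pandas as pd

# Hypothetical data: same as the question except the last value of a.
a = pd.Series([4, 8, 3, 6, 5])
b = pd.Series([3, 9, 5, 5, 4])

diff = (a - b).values                 # [ 1 -1 -2  1  1]
rolled = np.roll(diff, 1)             # [ 1  1 -1 -2  1], last value wrapped to index 0
carry = np.clip(rolled, 0, None)      # keep only positive carries
carry[0] = 0                          # discard the wrapped-around value

alist = a.values + carry
blist = np.maximum(b.values - alist, 0)

print(alist)   # [4 9 3 6 6]
print(blist)   # [0 0 2 0 0]
```

Note that this sketch simply drops a trailing surplus, whereas the loop in the question appends it to alist as an extra element; which behavior is wanted is an assumption to settle for the real data.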

Answer 3: (score: 0)

Consider the following code, which uses .shift() and then roll().

import numpy as np
import pandas as pd

# a and b are the Series defined in the question
df = pd.DataFrame({
    'a': a,
    'b': b
})
alist = list(np.roll((df['a'].shift(-1)+(df['a']-df['b']).clip(lower=0)).fillna(df.iloc[0]['a']), 1).astype(int))
blist = list((df['b'] - alist).clip(lower=0))
print(alist)
print(blist)