Find the difference in value and date between rows of a pandas DataFrame, given a condition column

Time: 2018-08-29 16:15:30

Tags: python-3.x pandas indexing difference datediff

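I am trying to find a way to compute the difference in the values of a particular column, as well as the difference in dates, based on the values of a third column (which are 0 and 1).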

My initial dataframe looks like this:


import pandas as pd

df = pd.DataFrame({'value': [-15, -10, 40, -25, -50, -90, 200],
                   'date': ['2018-01-20', '2018-01-19', '2018-01-19',
                            '2018-01-18', '2018-01-17', '2018-01-16',
                            '2018-01-15'],
                   'flag': [0, 0, 1, 0, 0, 0, 1]})

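The flag column is 1 whenever the value column is greater than zero, and 0 otherwise. Assume the dataframe is sorted by date. Given this dataframe, I want to compute, for each row, the change in value and in date relative to the most recent earlier date on which flag equals 1.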

The resulting df should look like this:

[image: expected output dataframe]

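Here, the first positive value we get is 40. The difference between 40 and -10 is 30, and the cumulative difference between that value and -15 is 15.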

1 answer:

Answer 0: (score: 1)

That is not a difference (diff); what you describe is a sum of the values.
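The answer's code did not survive in this copy, so the following is only a minimal sketch of the idea, not the original solution. It assumes the rows are sorted newest first, as in the question, and that the "change" is a running sum of value that restarts at each row where flag equals 1. The output column names value_change and days_since_flag are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'value': [-15, -10, 40, -25, -50, -90, 200],
                   'date': ['2018-01-20', '2018-01-19', '2018-01-19',
                            '2018-01-18', '2018-01-17', '2018-01-16',
                            '2018-01-15'],
                   'flag': [0, 0, 1, 0, 0, 0, 1]})
df['date'] = pd.to_datetime(df['date'])

# Walk the rows in chronological order (oldest first).
rev = df.iloc[::-1]

# Every flag == 1 row starts a new group; the rows that follow it
# (until the next flag) share the same group label.
group = rev['flag'].cumsum()

# Running sum of value inside each group, starting at the flagged row.
# Assignment aligns on the index, so the original row order is kept.
df['value_change'] = rev.groupby(group)['value'].cumsum()

# Days elapsed since the flagged row that opened the group.
anchor = rev.groupby(group)['date'].transform('first')
df['days_since_flag'] = (rev['date'] - anchor).dt.days

print(df)
```

For the sample data this gives 30 and 15 for the two rows after the 40, matching the numbers in the question. Any rows dated before the first flagged row would all land in group 0 and would need separate handling; the sample data has none.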