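I am trying to find a way to compute the difference in the values of a particular column, as well as the difference in dates, based on the value of a third column (either 0 or 1).
My initial dataframe looks like this: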
import pandas as pd

df = pd.DataFrame({'value': [-15, -10, 40, -25, -50, -90, 200],
                   'date': ['2018-01-20', '2018-01-19', '2018-01-19',
                            '2018-01-18', '2018-01-17', '2018-01-16',
                            '2018-01-15'],
                   'flag': [0, 0, 1, 0, 0, 0, 1]})
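The flag column is 1 whenever the value column is greater than zero, and 0 otherwise. Assume the frame is sorted by date. Given this dataframe, I want to compute, for each row, the change in value and in date relative to the most recent earlier-dated row whose flag equals 1.
The resulting df should look like this:
Here, the first positive value we reach is 40. The difference between 40 and -10 is 30 (that is, 40 + (-10)), and the cumulative difference between that result and -15 is 15 (30 + (-15)).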
Answer 0 (score: 1)
That is not a difference (diff); it is a sum (sum) of the values.
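As a minimal sketch of one reading of the question: treat every flag == 1 row as the start of a block, keep a running sum of value inside each block, and count the days elapsed since the block's flagged row. The column names value_change and days_since_flag are made up for illustration, and the sketch assumes the frame is sorted newest-first (as in the sample data) and that the oldest row carries flag == 1.

```python
import pandas as pd

df = pd.DataFrame({
    'value': [-15, -10, 40, -25, -50, -90, 200],
    'date': ['2018-01-20', '2018-01-19', '2018-01-19',
             '2018-01-18', '2018-01-17', '2018-01-16',
             '2018-01-15'],
    'flag': [0, 0, 1, 0, 0, 0, 1],
})
df['date'] = pd.to_datetime(df['date'])

# Assumption: rows are ordered newest-first, so reverse them to walk forward in time.
chrono = df.iloc[::-1]

# Each flag == 1 row opens a new block; the rows after it join that block
# until the next flagged row appears.
block = chrono['flag'].cumsum()

# Running total of value within each block (hypothetical column name).
df['value_change'] = chrono.groupby(block)['value'].cumsum()

# Days elapsed since the flagged row that opened the block (hypothetical column name).
block_start = chrono.groupby(block)['date'].transform('first')
df['days_since_flag'] = (chrono['date'] - block_start).dt.days

print(df)
```

For the sample data this gives 40, 30 and 15 for the three newest rows, which matches the figures quoted in the question. Both new columns are assigned back by index, so df keeps its original newest-first order.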