My data frame looks like this:
name<-c("ab","ab","ab","ac","ac","ac","d","d","d")
value<-c(9,9,6,10,10,4,8,9,8)
week<-c(31,31,32,31,31,35,32,33,35)
c<-data.frame(name,value,week)
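I want to create a new column holding the difference between each row's value and the value from the previous week, provided the previous week exists for that name. If it does not exist, 0 should be shown. For the data frame above, the expected result is: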
  name value week df
1   ab     9   31  0
2   ab     9   31  0
3   ab     6   32 -3
4   ac    10   31  0
5   ac    10   31  0
6   ac     4   35  0
7    d     8   32  0
8    d     9   33  1
9    d     8   35  0
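Answer 0 (score: 2)
This is straightforward with dplyr. A little arithmetic ensures the difference is shown only when the previous row's week is exactly one less than the current row's week: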
library(dplyr)
c <- c %>%
  group_by(name) %>%
  mutate(df = c(0, diff(value)) * as.numeric(c(0, diff(week)) == 1))
  name  value  week    df
  <fct> <dbl> <dbl> <dbl>
1 ab        9    31     0
2 ab        9    31     0
3 ab        6    32    -3
4 ac       10    31     0
5 ac       10    31     0
6 ac        4    35     0
7 d         8    32     0
8 d         9    33     1
9 d         8    35     0
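The same rule can also be written with lag(), which some readers may find easier to follow than multiplying by a 0/1 mask. This is only a sketch of an alternative, not the answer's own code: the data frame name dat, the result name res, and the helper column prev_is_adjacent are assumptions introduced here for illustration.

```r
library(dplyr)

# Data from the question, stored as `dat` (a name chosen for this sketch)
# so that the data frame does not share a name with base::c.
dat <- data.frame(
  name  = c("ab", "ab", "ab", "ac", "ac", "ac", "d", "d", "d"),
  value = c(9, 9, 6, 10, 10, 4, 8, 9, 8),
  week  = c(31, 31, 32, 31, 31, 35, 32, 33, 35)
)

res <- dat %>%
  group_by(name) %>%
  mutate(
    # lag() yields NA on the first row of each group; `%in% TRUE` turns
    # that NA into FALSE, so "no previous row" counts as "not adjacent".
    prev_is_adjacent = (week - lag(week) == 1) %in% TRUE,
    # Difference to the previous row only when its week is exactly one less.
    df = if_else(prev_is_adjacent, value - lag(value), 0)
  ) %>%
  ungroup() %>%
  select(-prev_is_adjacent)
```

Like the original, this compares each row with the row directly above it inside the same name, so it relies on the rows already being sorted by week within each name.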
Answer 1 (score: 2)
Calling the data.frame df and the new column diff, here is one way to do it with data.table:
library(data.table)
setDT(df)
df[ , diff := ifelse(week-shift(week)==1, value-shift(value), 0), by=name]
df[is.na(diff), diff := 0]
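For a self-contained run, the snippet above can be combined with the data from the question. The sketch below assumes the question's data frame is simply rebuilt under the name df. The logic is the answer's, with comments added.

```r
library(data.table)

# Data from the question, stored as `df` to match the answer's naming.
df <- data.frame(
  name  = c("ab", "ab", "ab", "ac", "ac", "ac", "d", "d", "d"),
  value = c(9, 9, 6, 10, 10, 4, 8, 9, 8),
  week  = c(31, 31, 32, 31, 31, 35, 32, 33, 35)
)
setDT(df)  # convert to a data.table by reference

# shift() lags by one row within each name; the first row of every group
# has no predecessor, so the comparison (and the result) is NA there.
df[, diff := ifelse(week - shift(week) == 1, value - shift(value), 0), by = name]

# Replace those leading NAs with 0, as the question asks.
df[is.na(diff), diff := 0]
```

The second statement is needed because ifelse() propagates NA from its test, so the first row of each name would otherwise stay NA.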
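Answer 2 (score: 1)
A data.table approach that joins against a temporary table in which week has been shifted forward by one step. Rows without a matching previous week are left as NA rather than 0: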
library(data.table)
dat <- as.data.table(c)
dat[
  unique(dat[, c(.SD, .(week1 = week + 1))]),
  on = c("name", "week" = "week1"),
  dfr := value - i.value
]
dat
#    name value week dfr
# 1:   ab     9   31  NA
# 2:   ab     9   31  NA
# 3:   ab     6   32  -3
# 4:   ac    10   31  NA
# 5:   ac    10   31  NA
# 6:   ac     4   35  NA
# 7:    d     8   32  NA
# 8:    d     9   33   1
# 9:    d     8   35  NA
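Since the question asks for 0 where no previous week exists, the NA values can be filled in afterwards. The sketch below is a variant of the join above rather than the answer's exact code: it builds the shifted lookup table explicitly, and the names prev and prev_value are assumptions introduced here.

```r
library(data.table)

dat <- data.table(
  name  = c("ab", "ab", "ab", "ac", "ac", "ac", "d", "d", "d"),
  value = c(9, 9, 6, 10, 10, 4, 8, 9, 8),
  week  = c(31, 31, 32, 31, 31, 35, 32, 33, 35)
)

# Lookup table: every (name, week) pair moved one week forward, carrying
# the value it had, so that week w in `dat` lines up with week w - 1 here.
prev <- unique(dat[, .(name, week = week + 1, prev_value = value)])

# Update join: only rows of `dat` that find a match in `prev` receive a value.
dat[prev, on = c("name", "week"), dfr := value - i.prev_value]

# Unmatched rows are still NA; set them to 0 as requested in the question.
dat[is.na(dfr), dfr := 0]
```

Unlike the row-by-row approaches, the join does not depend on row order. If a name had several different values in the previous week, more than one lookup row would match and only one of them would end up in dfr, so that case would need an explicit rule.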