Calculating row differences in a data frame relative to the first instance of each group

Asked: 2014-11-18 05:38:27

Tags: r dataframe plyr dplyr lapply

Input data

ID   value  
a    10  
a    12  
a    18  
a    13  
b    23  
b    25  
b    33  
c    17  
c    23  
c    27  

The output data should look like this:

ID   value     Diff  
a    10        0  
a    12        2    
a    18        8  
a    13        3  
b    23        0   
b    25        2  
b    33       10  
c    17        0  
c    23        6  
c    27       10     

I found this code online:

library(data.table)  
DT <- as.data.table(dat)  
DT[, `:=`(DIFTIME, c(0, diff(as.Date(DATETIME)))), by = "ID"]  

But this only gives the difference between two consecutive rows, not the difference from the first instance of each group.

dat<-structure(list(ID = c(1L, 1L, 1L, 1L, 2L, 2L, 3L, 3L),
DATETIME = structure(c(1328346000,1328479200, 1331024400,1331025400, 1328086800, 1328184000,   1336287600, 1336424400),
class = c("POSIXct", "POSIXt"), tzone = ""),
VALUE = c(300L,150L, 650L, 450L, 855L, 240L, 340L, 240L)),
.Names = c("ID", "DATETIME","VALUE"), class = "data.frame", row.names = c(NA, -8L))

3 answers:

Answer 0 (score: 4)

You can also use dplyr, where df is the original data:

library(dplyr)
group_by(df, ID) %>% mutate(Diff = value - first(value))
#    ID value Diff
# 1   a    10    0
# 2   a    12    2
# 3   a    18    8
# 4   a    13    3
# 5   b    23    0
# 6   b    25    2
# 7   b    33   10
# 8   c    17    0
# 9   c    23    6
# 10  c    27   10
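
To make the difference between the two calculations concrete, here is a minimal sketch that computes both side by side on the sample data. The column names `Consecutive` and `FromFirst` and the object `res` are illustrative and not part of the original answer; `dplyr::lag()` with `default = first(value)` is used here to mimic `c(0, diff(value))`.

```r
library(dplyr)

# Sample data from the question
df <- data.frame(
  ID    = c("a", "a", "a", "a", "b", "b", "b", "c", "c", "c"),
  value = c(10L, 12L, 18L, 13L, 23L, 25L, 33L, 17L, 23L, 27L)
)

res <- df %>%
  group_by(ID) %>%
  mutate(
    # difference from the previous row in the group (what diff() gives)
    Consecutive = value - dplyr::lag(value, default = first(value)),
    # difference from the first row in the group (what the question asks for)
    FromFirst   = value - first(value)
  ) %>%
  ungroup()

print(res)
```

For group a (10, 12, 18, 13) the consecutive differences are 0, 2, 6, -5, whereas the differences from the first row are 0, 2, 8, 3. Note that "first" means the first row in the current row order, so sort the data beforehand if a particular ordering matters.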

Answer 1 (score: 3)

Using data.table:

setDT(df)[, Diff:=value-value[1], by=ID]
df
 #   ID value Diff
 #1:  a    10    0
 #2:  a    12    2
 #3:  a    18    8
 #4:  a    13    3
 #5:  b    23    0
 #6:  b    25    2
 #7:  b    33   10
 #8:  c    17    0
 #9:  c    23    6
#10:  c    27   10

Data

df <- structure(list(ID = c("a", "a", "a", "a", "b", "b", "b", "c", 
"c", "c"), value = c(10L, 12L, 18L, 13L, 23L, 25L, 33L, 17L, 
23L, 27L)), .Names = c("ID", "value"), class = "data.frame", row.names = c(NA, 
-10L))
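
The same `x - x[1]` idea can be carried over to the `DATETIME` column from the code in the question. The following is only a sketch under some assumptions: that the elapsed time from the first timestamp of each `ID` is wanted in days, that the timestamps can be treated as UTC (the original `tzone` is empty), and that `difftime()` is an acceptable substitute for the `as.Date()` conversion in the question, chosen here so the result does not depend on the local time zone.

```r
library(data.table)

# The question's data, rebuilt with an explicit time zone (assumption: UTC)
dat <- data.frame(
  ID = c(1L, 1L, 1L, 1L, 2L, 2L, 3L, 3L),
  DATETIME = as.POSIXct(
    c(1328346000, 1328479200, 1331024400, 1331025400,
      1328086800, 1328184000, 1336287600, 1336424400),
    origin = "1970-01-01", tz = "UTC"
  ),
  VALUE = c(300L, 150L, 650L, 450L, 855L, 240L, 340L, 240L)
)

DT <- as.data.table(dat)

# Elapsed days since the first timestamp within each ID
DT[, DIFTIME := as.numeric(difftime(DATETIME, DATETIME[1], units = "days")),
   by = ID]

print(DT)
```

The first row of every group gets 0, and each later row holds the time elapsed since that first row rather than since the preceding row.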

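Answer 2 (score: 1)

You can do this in base R with the ave function. Here dat is assumed to be a data frame holding the ID and value columns shown in the question's input data.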

dat$Diff <- ave( dat$value, dat$ID, FUN = function(x) x - x[1] )
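
A self-contained sketch of the same base R approach, rebuilding the sample data so it can be run as is. The object name `df` is illustrative.

```r
# Sample data from the question
df <- data.frame(
  ID    = c("a", "a", "a", "a", "b", "b", "b", "c", "c", "c"),
  value = c(10L, 12L, 18L, 13L, 23L, 25L, 33L, 17L, 23L, 27L)
)

# ave() applies FUN to the values of each ID group and returns a vector
# in the original row order, so it can be assigned straight to a new column
df$Diff <- ave(df$value, df$ID, FUN = function(x) x - x[1])

print(df)
```

Because `ave()` keeps the original row order, no sorting or merging step is needed afterwards.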