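Calculating deltas between different rows in the same table

Time: 2016-11-03 14:50:11

Tags: r

I have a table containing a large number of measurements from different meters. Each measurement is stored in a new row together with the actual meter value. I need the difference between consecutive measurements for each meter.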

Simplified input:

 [2016-11-03,MeterA,45]
 [2016-11-03,MeterB,45]
 [2016-11-04,MeterA,47]
 [2016-11-04,MeterB,54]

I am currently doing this with several for loops, but that takes a long time and there is probably a more efficient way. The current code:

data$diff <- 0
for (address in unique(data$Address)) {
    # All rows for one address
    subaddr <- subset(data, data$Address == address)
    for (meter in unique(subaddr$Meter)) {
        # All rows for one meter at that address
        submeter <- subset(subaddr, subaddr$Meter == meter)
        for (i in seq_len(nrow(submeter))) {
            if (i > 1) {
                prow <- submeter[i - 1, ]
                row <- submeter[i, ]
                # Write the difference back to the matching row of the full table
                data[which(data$Address == address & data$Meter == meter & data$UCPTlogTime == row$UCPTlogTime), ]$diff <- row$UCPTvalue - prow$UCPTvalue
            }
        }
    }
}

Expected output:

 [2016-11-03,MeterA,0]
 [2016-11-03,MeterB,0]
 [2016-11-04,MeterA,2]
 [2016-11-04,MeterB,9]

4 answers:

Answer 0 (score: 2)

Here is one way to do it using data.table:

library(data.table)
dt <- data.table(df)

dt[,delta := c(0, diff(value)), by = "group"][]
#           date group value delta
#  1: 2016-11-04     A    24     0
#  2: 2016-11-04     B    24     0
#  3: 2016-11-05     A    30     6
#  4: 2016-11-05     B    31     7
#  5: 2016-11-06     A    36     6
#  6: 2016-11-06     B    38     7
#  7: 2016-11-07     A    44     8
#  8: 2016-11-07     B    46     8
#  9: 2016-11-08     A    51     7
# 10: 2016-11-08     B    54     8
# 11: 2016-11-09     A    57     6
# 12: 2016-11-09     B    56     2
# 13: 2016-11-10     A    61     4
# 14: 2016-11-10     B    61     5
# 15: 2016-11-11     A    68     7
# 16: 2016-11-11     B    69     8
# 17: 2016-11-12     A    72     4
# 18: 2016-11-12     B    73     4
# 19: 2016-11-13     A    81     9
# 20: 2016-11-13     B    82     9

Data:

df <- data.frame(
    date = rep(Sys.Date() + 1:10, each = 2),
    group = rep(c("A", "B"), 10),
    value = rpois(2, 20) + cumsum(rpois(20, 3)),
    stringsAsFactors = FALSE
)
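
To apply the same idea to the table from the question, a minimal sketch follows. It assumes the column names Address, Meter, UCPTlogTime and UCPTvalue (taken from the question's code) and uses made-up readings. The rows are sorted within each meter first, because diff() works in row order.

```r
library(data.table)

# Hypothetical sample readings, deliberately not sorted by time
readings <- data.table(
  Address     = c("Addr1", "Addr1", "Addr1", "Addr1"),
  Meter       = c("MeterA", "MeterB", "MeterB", "MeterA"),
  UCPTlogTime = as.Date(c("2016-11-04", "2016-11-03", "2016-11-04", "2016-11-03")),
  UCPTvalue   = c(47, 45, 54, 45)
)

# Sort so that each meter's rows are in time order
setorder(readings, Address, Meter, UCPTlogTime)

# First reading of each meter gets 0; later ones get value minus previous value
readings[, delta := c(0, diff(UCPTvalue)), by = .(Address, Meter)]
```

Grouping by both Address and Meter keeps meters with the same name at different addresses apart.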

Answer 1 (score: 2)

dplyr makes this easy with the lag function. Assuming the columns in the data frame are named UCPTlogTime, Address, Meter and UCPTvalue:

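library(dplyr)

data <- data %>%
  group_by(Address, Meter) %>%
  # order_by makes lag() pick the previous reading in time order,
  # even when the rows are not sorted by UCPTlogTime
  mutate(delta = UCPTvalue - lag(UCPTvalue, order_by = UCPTlogTime)) %>%
  # The first reading of each meter has no predecessor, so set its delta to 0
  mutate(delta = ifelse(is.na(delta), 0, delta))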
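
A small worked example can help when checking a pipeline like this. The sketch below uses made-up readings with the column names assumed above, sorts them explicitly with arrange(), and lets lag() fall back to the first reading of the group so that the first delta is 0. The name `readings` and the values in it are illustrative only.

```r
library(dplyr)

# Hypothetical sample readings, deliberately not sorted by time
readings <- data.frame(
  Address     = "Addr1",
  Meter       = c("MeterA", "MeterB", "MeterB", "MeterA"),
  UCPTlogTime = as.Date(c("2016-11-04", "2016-11-03", "2016-11-04", "2016-11-03")),
  UCPTvalue   = c(47, 45, 54, 45)
)

result <- readings %>%
  arrange(Address, Meter, UCPTlogTime) %>%
  group_by(Address, Meter) %>%
  # default = first(UCPTvalue) makes the first delta of each group 0
  mutate(delta = UCPTvalue - lag(UCPTvalue, default = first(UCPTvalue))) %>%
  ungroup()
```

Sorting up front changes the row order of the result; if the original order matters, the order_by argument of lag() avoids the reordering.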

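Answer 2 (score: 1)

This seems simpler, where diff is the column you want to compute. Note that the loop subtracts the value in the first row from every row and does not group by meter; it reproduces the expected numbers here only because both meters start at 45.

for (i in 1:nrow(t)) { t$diff[i] <- t[i, 3] - t[1, 3] }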
t
     v1     v2 v3 diff
1 Date1 MeterA 45    0
2 Date2 MeterB 45    0
3 Date3 MeterC 47    2
4 Date4 MeterD 54    9
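
For a per-meter difference between consecutive rows without extra packages, base R's ave() can apply diff() within each group. A minimal sketch, assuming the column names Meter and UCPTvalue from the question, made-up readings, and rows that are already in time order:

```r
# Hypothetical sample readings, already in time order within each meter
readings <- data.frame(
  UCPTlogTime = as.Date(c("2016-11-03", "2016-11-03", "2016-11-04", "2016-11-04")),
  Meter       = c("MeterA", "MeterB", "MeterA", "MeterB"),
  UCPTvalue   = c(45, 45, 47, 54)
)

# ave() splits UCPTvalue by Meter, applies the function to each piece,
# and returns the results in the original row order
readings$diff <- ave(
  readings$UCPTvalue,
  readings$Meter,
  FUN = function(v) c(0, diff(v))
)
```

If the table also has an Address column, it can be passed to ave() as a further grouping argument after readings$Meter.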

Answer 3 (score: 1)

Here is another approach using dplyr. I did not see a variable for Address in the sample, but you can add it to group_by().

library(dplyr)

df <- data.frame(read_date = c("2016-11-03",
                               "2016-11-03",
                               "2016-11-04",
                               "2016-11-04"),
                 Meter = c("MeterA",
                           "MeterB",
                           "MeterA",
                           "MeterB"),
                 UCPTvalue = c(45,
                               45,
                               47,
                               54))

out <- df %>%
        group_by(Meter) %>%
        mutate(diff = ifelse(row_number() == 1,
                             0,
                             UCPTvalue - lag(UCPTvalue, 1)))