How to update the row after the current row when a condition is met

Time: 2018-01-04 10:04:54

Tags: r dataframe datatable dplyr

I have a data table as follows:

library(data.table)
library(lubridate)

dput(data)
structure(list(Id = c(1, 1, 1, 1), start = structure(c(1509525095, 
1509529535, 1509532655, 1509543455), class = c("POSIXct", "POSIXt"
), tzone = "NA"), end = structure(c(1509525450, 1509529535, 1509535650, 
1509549450), class = c("POSIXct", "POSIXt"), tzone = "NA"), spot = structure(c(1509524490, 
1509529235, 1509529715, 1509542250), class = c("POSIXct", "POSIXt"
), tzone = "NA"), type = structure(c(1L, 1L, 3L, 1L), .Label = c("1", 
"2", "3"), class = "factor"), consumption = structure(c(10.0833333333333, 
5, 49, 20.0833333333333), units = "mins", class = "difftime")), .Names = c("Id", 
"start", "end", "spot", "type", "consumption"), row.names = c(NA, 
-4L), class = c("data.table", "data.frame"))

From this, I want to add a new column, spot_new, filled in on the row that follows a row where start == end.

I tried

  data[start=end, data:=c(NA, spot[-.N]), by=Id]

but this does not do what I want. Any help is appreciated.

Desired output

2 answers:

Answer 0 (score: 3)

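I can offer a dplyr solution (with df holding the question's data) that uses rowwise with an if/else statement to fill the new column with spot wherever start equals end. We then use lag to shift it down by one position, i.e.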

library(dplyr)

df %>% 
 group_by(Id) %>% 
 rowwise() %>% 
 mutate(spot_new = if(start == end){spot}else(NA)) %>% 
 ungroup() %>% 
 mutate(spot_new = lag(spot_new))

which gives

# A tibble: 4 x 7
     Id               start                 end                spot   type   consumption            spot_new
  <dbl>              <dttm>              <dttm>              <dttm> <fctr>        <time>              <dttm>
1     1 2017-11-01 08:31:35 2017-11-01 08:37:30 2017-11-01 08:21:30      1 10.08333 mins                  NA
2     1 2017-11-01 09:45:35 2017-11-01 09:45:35 2017-11-01 09:40:35      1  5.00000 mins                  NA
3     1 2017-11-01 10:37:35 2017-11-01 11:27:30 2017-11-01 09:48:35      3 49.00000 mins 2017-11-01 09:40:35
4     1 2017-11-01 13:37:35 2017-11-01 15:17:30 2017-11-01 13:17:30      1 20.08333 mins                  NA
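
A vectorized variant of the same idea, as a minimal sketch rather than the answerer's own code: if_else replaces the rowwise if/else, and lag runs inside group_by(Id) so a value is never shifted from one Id into the next. The data frame is rebuilt from the epoch seconds in the question's dput(); the to_time helper and tz = "UTC" are assumptions made for illustration.

```r
library(dplyr)

# Hypothetical helper: epoch seconds -> POSIXct (tz = "UTC" is an assumption)
to_time <- function(x) as.POSIXct(x, origin = "1970-01-01", tz = "UTC")

# Same values as the question's dput(), minus the columns not needed here
df <- data.frame(
  Id    = c(1, 1, 1, 1),
  start = to_time(c(1509525095, 1509529535, 1509532655, 1509543455)),
  end   = to_time(c(1509525450, 1509529535, 1509535650, 1509549450)),
  spot  = to_time(c(1509524490, 1509529235, 1509529715, 1509542250))
)

res <- df %>%
  group_by(Id) %>%
  # keep spot only where start == end, then shift it down one row within Id
  mutate(spot_new = lag(if_else(start == end, spot, as.POSIXct(NA)))) %>%
  ungroup()

res
```

As in the output above, only row 3 receives a value, namely the spot of row 2 (the row where start equals end).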

Answer 1 (score: 1)

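Here we get the row index (.I) of the next row by adding 1 to the position returned by which(start == end). To handle the edge case where the last row of a group has the same 'start' and 'end', pmin caps the index at the group's last row (though it is not clear what should happen in that case).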

 i1 <- data[, .I[pmin(which(start == end)+1, .N)], Id]$V1
 data[i1, spot_new := spot][]
# Id               start                 end                spot type   consumption            spot_new
#1:  1 2017-11-01 08:31:35 2017-11-01 08:37:30 2017-11-01 08:21:30    1 10.08333 mins                <NA>
#2:  1 2017-11-01 09:45:35 2017-11-01 09:45:35 2017-11-01 09:40:35    1  5.00000 mins                <NA>
#3:  1 2017-11-01 10:37:35 2017-11-01 11:27:30 2017-11-01 09:48:35    3 49.00000 mins 2017-11-01 09:48:35
#4:  1 2017-11-01 13:37:35 2017-11-01 15:17:30 2017-11-01 13:17:30    1 20.08333 mins                <NA>
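
The two answers fill row 3 with different values: the dplyr answer copies the spot of the matching row (09:40:35), while the data.table answer copies the spot of the following row itself (09:48:35). The sketch below shows both readings side by side using shift, under stated assumptions: the to_time helper, tz = "UTC", and the column names prev_match, prev_spot, spot_own and spot_prev are all hypothetical and not part of either answer.

```r
library(data.table)

# Hypothetical helper: epoch seconds -> POSIXct (tz = "UTC" is an assumption)
to_time <- function(x) as.POSIXct(x, origin = "1970-01-01", tz = "UTC")

# Same values as the question's dput(), minus the columns not needed here
dt <- data.table(
  Id    = c(1, 1, 1, 1),
  start = to_time(c(1509525095, 1509529535, 1509532655, 1509543455)),
  end   = to_time(c(1509525450, 1509529535, 1509535650, 1509549450)),
  spot  = to_time(c(1509524490, 1509529235, 1509529715, 1509542250))
)

# Within each Id, flag rows whose previous row has start == end
dt[, prev_match := shift(start == end, fill = FALSE), by = Id]

# Reading A: the flagged row gets its own spot
dt[(prev_match), spot_own := spot]

# Reading B: the flagged row gets the previous row's spot
dt[, prev_spot := shift(spot), by = Id]
dt[(prev_match), spot_prev := prev_spot]

# Drop the helper columns
dt[, c("prev_match", "prev_spot") := NULL]
dt[]
```

Unlike the pmin approach, this sketch writes nothing when the matching row is the last row of its group, since there is no following row to update.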