Create a dataframe or tibble based on values in different rows in another column

Time: 2018-08-22 13:52:44

Tags: r tidyverse

I have a column of event time offsets (ms) like this (the real data is much larger):

> ts_data = tibble( t = c(34, 78, 111, 165, 189))
> ts_data
# A tibble: 5 x 1
      t
  <dbl>
1    34
2    78
3   111
4   165
5   189

I'd like to create a second column where the value in each row is the difference between the current row and the previous one (assuming t = 0 at the start). Worked out by hand for the data above, I want to end up with this:

> add_column(ts_data, t_int = c(34, 44, 33, 54, 24))
# A tibble: 5 x 2
      t t_int
  <dbl> <dbl>
1    34    34
2    78    44
3   111    33
4   165    54
5   189    24

i.e. 44 = 78 - 34, 33 = 111 - 78, and so on.

I could do this with a loop, but I was expecting there to be a neater way using relative indexing. So far my search has not turned one up.

Any pointers would be appreciated :-)

1 answer:

Answer 0 (score: 3)

An easier option is diff, which returns a vector one element shorter than the original vector (or column). So prepend the first value of 't' to make the result the same length as the original column:

library(dplyr)
ts_data %>% 
   mutate(t_int = c(first(t), diff(t)))
# A tibble: 5 x 2
#      t t_int
#  <dbl> <dbl>
#1    34    34
#2    78    44
#3   111    33
#4   165    54
#5   189    24
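
To see why the first value has to be prepended, here is a minimal base R sketch using the same five values from the question. The variable names d and t_int are only illustrative.

```r
# Same offsets as in the question, as a plain numeric vector
t <- c(34, 78, 111, 165, 189)

# diff() returns successive differences, so it has one element fewer than t
d <- diff(t)
length(t)  # 5
length(d)  # 4
d          # 44 33 54 24

# Prepending the first offset (the interval from t = 0) restores the length
t_int <- c(t[1], d)
t_int      # 34 44 33 54 24
```

Without the prepended value, mutate would fail because the new column would have 4 elements for a 5-row tibble.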

Or subtract the lag of the column from the original column, setting default = 0 so that the first row is compared against 0 (without it, the first value would be NA):

ts_data %>%
      mutate(t_int = t - lag(t, default = 0))
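
As a quick sanity check, the lag-based calculation can be reproduced in base R to confirm that it matches the diff-based one. This is only a sketch: the shifted vector below is built by hand with head() as a stand-in for lag(t, default = 0), and the variable names are illustrative. It also shows that padding the input with 0 before calling diff gives the same result, which follows from the question's assumption of t = 0 at the start.

```r
t <- c(34, 78, 111, 165, 189)

# Hand-built equivalent of lag(t, default = 0): shift down by one, fill with 0
shifted <- c(0, head(t, -1))
via_lag <- t - shifted

# Padding with the assumed start time of 0 lets diff() return a full-length result
via_diff <- diff(c(0, t))

via_lag                       # 34 44 33 54 24
identical(via_lag, via_diff)  # TRUE
```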