Calculating sequences from summarized counts

Time: 2019-04-23 12:48:26

Tags: r

I am trying to build a series of row ranges/values from the following data:

# A tibble: 4 x 2
  year_row breaks
  <chr>     <int>
1 2015          7
2 2016          6
3 2017          5
4 2018          5

That is, each range ends at the running total of breaks:

7 + 6 = 13

13 + 5 = 18

18 + 5 = 23

Expected output:

2015     1:7
2016     8:13
2017     14:18
2018     19:23

Later I want to use these sequences in some function or loop.

Data:

structure(list(year_row = c("2015", "2016", "2017", "2018"), 
    breaks = c(7L, 6L, 5L, 5L)), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -4L))
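
The answers below refer to this data as `df` (or `df1`). A minimal sketch for following along, assuming both names should point at the same object; the names come from the answers, not from the question itself:

```r
# Recreate the question's data; the tibble classes are kept as in the dput() output.
df <- structure(
  list(year_row = c("2015", "2016", "2017", "2018"),
       breaks = c(7L, 6L, 5L, 5L)),
  class = c("tbl_df", "tbl", "data.frame"),
  row.names = c(NA, -4L)
)
df1 <- df  # alias used by the first answer

# Total number of positions the ranges must cover.
sum(df$breaks)
#[1] 23
```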

3 answers:

Answer 0 (score: 6)

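We take the cumulative sum of 'breaks' to get the end of each range and the cumulative sum of the lag of 'breaks' (plus one) to get the start, then paste the two together.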

library(dplyr)
library(stringr)
# df1 is the tibble from the question
df1 %>% 
   mutate(new = cumsum(breaks),                              # end of each range
          new2 = cumsum(lag(breaks, default = 0)) + 1) %>%   # start of each range
   transmute(year_row, new3 = str_c(new2, new, sep = ":"))
# A tibble: 4 x 2
#  year_row new3 
#  <chr>    <chr>
#1 2015     1:7  
#2 2016     8:13 
#3 2017     14:18
#4 2018     19:23
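
To see why this works, it may help to look at the intermediate values. The sketch below uses base R only, with `c(0L, head(breaks, -1))` standing in for `lag(breaks, default = 0)`; the names `ends` and `starts` are illustrative and do not appear in the answer.

```r
breaks <- c(7L, 6L, 5L, 5L)

# End of each range: running total of the counts.
ends <- cumsum(breaks)

# Start of each range: running total of the *previous* counts, plus one.
# c(0L, head(breaks, -1)) shifts the counts down one row, like lag(..., default = 0).
starts <- cumsum(c(0L, head(breaks, -1))) + 1L

paste(starts, ends, sep = ":")
#[1] "1:7"   "8:13"  "14:18" "19:23"
```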

Answer 1 (score: 6)

A base R idea:

v1 <- cumsum(df$breaks)
v2 <- c(1, v1+1)
paste(v2[-length(v2)], v1, sep = ':')
#[1] "1:7"   "8:13"  "14:18" "19:23"

If you want them as actual vectors rather than strings, you can use Map.

Assuming v1 and v2 have already been constructed as shown above,

Map(`:`, v2[-length(v2)], v1)
#[[1]]
#[1] 1 2 3 4 5 6 7

#[[2]]
#[1]  8  9 10 11 12 13

#[[3]]
#[1] 14 15 16 17 18

#[[4]]
#[1] 19 20 21 22 23
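
Since the question mentions using the sequences later in a function or loop, here is a minimal sketch of that step. The vector `x` and the per-year sum are hypothetical stand-ins for whatever data and computation the ranges are meant to index.

```r
year_row <- c("2015", "2016", "2017", "2018")
breaks   <- c(7L, 6L, 5L, 5L)

# Build the list of index vectors as in the answer above.
v1 <- cumsum(breaks)
v2 <- c(1, v1 + 1)
ranges <- Map(`:`, v2[-length(v2)], v1)
names(ranges) <- year_row

# Hypothetical data with one value per position 1..23.
x <- seq_len(sum(breaks)) * 10

# Apply a function to the slice of x that belongs to each year.
totals <- vapply(ranges, function(idx) sum(x[idx]), numeric(1))
totals
```

Naming the list by year makes it possible to look up a single range directly, for example `ranges[["2017"]]`.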

To attach it to the data frame as a list column,

df$ranges <- Map(`:`, v2[-length(v2)], v1)
df
# A tibble: 4 x 3
#  year_row breaks ranges   
#  <chr>     <int> <list>   
#1 2015          7 <int [7]>
#2 2016          6 <int [6]>
#3 2017          5 <int [5]>
#4 2018          5 <int [5]>
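
One caveat, which is an assumption about data not present in the question: if a year had zero breaks, `start:end` would count downwards (for example `8:7` gives `8 7`) instead of returning an empty range. A sketch that builds each range from its start and length with `seq_len()` avoids this; the counts below are made up for illustration.

```r
breaks <- c(7L, 0L, 5L)  # hypothetical counts, including a zero

ends   <- cumsum(breaks)
starts <- ends - breaks + 1L

# seq_len(0) is empty, so a zero count yields integer(0) rather than c(8, 7).
ranges <- Map(function(s, n) s - 1L + seq_len(n), starts, breaks)

lengths(ranges)
#[1] 7 0 5
```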

Answer 2 (score: 3)

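Using the same basic idea as @akrun, but without lag(): the start of each range is the cumulative sum minus the current 'breaks', plus one.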

df %>%
 mutate(res = cumsum(breaks),
        res = paste((res - breaks) + 1, res, sep = ":"))

  year_row breaks   res
1     2015      7   1:7
2     2016      6  8:13
3     2017      5 14:18
4     2018      5 19:23

The same in base R:

res <- cumsum(df$breaks)
df$res <- paste((res - df$breaks) + 1, res, sep = ":")

Or, if you want them as actual vectors:

df %>%
 mutate(res1 = cumsum(breaks),
        res2 = (res1 - breaks) + 1) %>%
 rowwise() %>%
 mutate(res = list(res2:res1)) %>%
 select(-res1, -res2)

  year_row breaks res      
     <chr>  <int> <list>   
1     2015      7 <int [7]>
2     2016      6 <int [6]>
3     2017      5 <int [5]>
4     2018      5 <int [5]>
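
Another way to get the same list, not taken from any of the answers above, is to label each position 1..23 with its year and split on that label. A base R sketch:

```r
year_row <- c("2015", "2016", "2017", "2018")
breaks   <- c(7L, 6L, 5L, 5L)

# Repeat each year as many times as it has breaks, then split the
# positions 1..sum(breaks) by that label.
labels <- rep(year_row, times = breaks)
ranges <- split(seq_len(sum(breaks)), labels)

ranges[["2016"]]
#[1]  8  9 10 11 12 13
```

Note that split() orders the resulting list by sorted label, so it matches the original row order only when the labels are already sorted, as they are here.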