Calculate the mean of level based on status

Time: 2017-12-25 12:35:00

Tags: r

I want to calculate a mean. My data looks like this:

    -----------
    level   sts
    -----------
    10      s
    -----------
    11      s
    -----------
    10      s
    -----------
    10      s
    -----------
    10      s
    -----------
    9       r
    -----------
    8.5     r
    -----------
    8       s
    -----------
    8.1     s
    -----------
    8       s
    -----------

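I want to calculate the mean of level according to sts (s = stop, r = run), averaging each consecutive run of s rows separately and leaving the r rows unchanged. I want output like this: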

    -----------
    level   sts
    -----------
    10.2     s
    -----------
    9        r
    -----------
    8.5      r
    -----------
    8.03     s
    -----------

Finally, the output should look like this, with each run's mean repeated on every row of that run:

    -----------
    level   sts
    -----------
    10.2    s
    -----------
    10.2    s
    -----------
    10.2    s
    -----------
    10.2    s
    -----------
    10.2    s
    -----------
    9       r
    -----------
    8.5     r
    -----------
    8.03    s
    -----------
    8.03    s
    -----------
    8.03    s
    -----------

If this has already been answered, please provide a link. Thanks.

1 answer:

Answer 0: (score: 1)

Based on your desired output, I would try something like:

    library(data.table)
    # rleid() gives each consecutive run of identical sts values its own id;
    # only rows with sts == "s" are overwritten with the mean of their run
    setDT(mydf)[, group := rleid(sts)][
      sts == "s", level := mean(level), .(sts, group)][]
    #         level sts group
    #  1: 10.200000   s     1
    #  2: 10.200000   s     1
    #  3: 10.200000   s     1
    #  4: 10.200000   s     1
    #  5: 10.200000   s     1
    #  6:  9.000000   r     2
    #  7:  8.500000   r     2
    #  8:  8.033333   s     3
    #  9:  8.033333   s     3
    # 10:  8.033333   s     3
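If data.table is not available, the same run-based grouping can be sketched in base R with `rle()` and `ave()`. This is a minimal sketch, assuming `mydf` has the `level` and `sts` columns from the question; the names `runs` and `run_mean` are only illustrative.

```r
# Sample data copied from the question
mydf <- data.frame(
  level = c(10, 11, 10, 10, 10, 9, 8.5, 8, 8.1, 8),
  sts   = c("s", "s", "s", "s", "s", "r", "r", "s", "s", "s"),
  stringsAsFactors = FALSE
)

# Run id: one integer per consecutive run of identical sts values
runs <- rle(mydf$sts)
mydf$group <- rep(seq_along(runs$lengths), times = runs$lengths)

# Mean of level within each run, repeated on every row of that run
run_mean <- ave(mydf$level, mydf$group, FUN = mean)

# Overwrite level only for the "s" rows; "r" rows keep their own value
mydf$level <- ifelse(mydf$sts == "s", run_mean, mydf$level)

print(mydf)
```

Unlike the data.table version, this does not modify `mydf` by reference; it assigns new columns to a regular data frame.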

I think the "tidyverse" equivalent would be something like this:

    library(tidyverse)
    library(data.table) # for `rleid`

    mydf %>%
      mutate(group = rleid(sts)) %>%   # id for each consecutive run of sts
      group_by(sts, group) %>%
      mutate(level = case_when(
        sts == "s" ~ mean(level),      # run mean for "s" rows
        TRUE ~ level                   # "r" rows are left unchanged
      ))
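The pipeline above loads data.table only for `rleid`. A dplyr-only variant is sketched below, assuming the same `mydf` columns as in the question: the run id is built with `cumsum()` over the positions where `sts` differs from the previous row. The name `result` and the trailing `ungroup()` are additions for illustration, not part of the original answer.

```r
library(dplyr)

# Sample data copied from the question
mydf <- data.frame(
  level = c(10, 11, 10, 10, 10, 9, 8.5, 8, 8.1, 8),
  sts   = c("s", "s", "s", "s", "s", "r", "r", "s", "s", "s"),
  stringsAsFactors = FALSE
)

result <- mydf %>%
  # TRUE wherever sts changes from the previous row; cumsum turns that into a run id
  mutate(group = cumsum(sts != lag(sts, default = first(sts))) + 1L) %>%
  group_by(group) %>%
  # sts is constant within a run, so the condition is all TRUE or all FALSE per group
  mutate(level = if_else(sts == "s", mean(level), level)) %>%
  ungroup()

print(result)
```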

Sample data:

    mydf <- structure(list(level = c(10, 11, 10, 10, 10, 9, 8.5, 8, 8.1, 
        8), sts = c("s", "s", "s", "s", "s", "r", "r", "s", "s", "s")),
        .Names = c("level", "sts"), row.names = c(NA, 10L), class = "data.frame")
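The question also shows a collapsed table with one row per run of s and each r row kept as is. The answer's snippets do not produce that shape, so here is a rough base R sketch of one way to get it, assuming the same sample data. The `key` and `collapsed` names are hypothetical.

```r
# Sample data copied from the question
mydf <- data.frame(
  level = c(10, 11, 10, 10, 10, 9, 8.5, 8, 8.1, 8),
  sts   = c("s", "s", "s", "s", "s", "r", "r", "s", "s", "s"),
  stringsAsFactors = FALSE
)

# Run id for each consecutive run of identical sts values
runs <- rle(mydf$sts)
run_id <- rep(seq_along(runs$lengths), times = runs$lengths)

# "s" rows share a key per run; every "r" row gets its own key so it is not merged
key <- ifelse(mydf$sts == "s",
              paste0("s", run_id),
              paste0("r", seq_len(nrow(mydf))))
key <- factor(key, levels = unique(key))  # keep the original row order

collapsed <- data.frame(
  level = as.vector(tapply(mydf$level, key, mean)),  # mean per key
  sts   = mydf$sts[!duplicated(key)],                # sts of the first row of each key
  stringsAsFactors = FALSE
)

print(collapsed)
```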