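Percent change in grouped data: computing relative to the group's first value

Time: 2019-03-20 18:11:46

Tags: r dplyr mutate

I am trying to get the percent change between the first value of a variable in a group and every other value of that same variable in the same group.

Example data: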

df = data.frame(group = c(rep('A',4), rep('B',3)),
            response = c(1,4,2,1,1,2,3),
            treatment = c("control","100mg","200mg","50mg","control","100mg","200mg"))

> df
    group response treatment
       A     1   control
       A     4     100mg
       A     2     200mg
       A     1      50mg
       B     1   control
       B     2     100mg
       B     3     200mg

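In other words, I want the percent change in response for every other treatment level relative to the "control" treatment within the same group. The number of treatment levels may differ between groups.

What I have so far: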

# function for % change
pct <- function(x) {(x/lag(x)-1)*100}

library(dplyr)
# group data and apply function
percChange <- df %>% 
  group_by(group) %>% 
  mutate_at(vars(response), funs(pct))

# the output (percChange) is:

#   group response treatment
# 1 A        NA   control  
# 2 A       300   100mg    
# 3 A       -50   200mg    
# 4 A       -50   50mg     
# 5 B        NA   control  
# 6 B       100   100mg    
# 7 B        50   200mg

But the output I want is:

# group  response  treatment
# 1 A        NA   control  
# 2 A       300   100mg    
# 3 A       100   200mg    
# 4 A       0     50mg     
# 5 B       NA    control  
# 6 B       100   100mg    
# 7 B       200   200mg

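I have looked everywhere and found similar questions, but none of them is quite what I am after. Thanks.

2 Answers:

Answer 0: (score: 2)

You want to use first(), which returns the first value of response within each group: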

library(tidyverse)

df = data.frame(
  group = c(rep('A',4), rep('B',3)),
  response = c(1,4,2,1,1,2,3),
  treatment = c("control","100mg","200mg","50mg","control","100mg","200mg")
)

df %>%
  group_by(group) %>%
  mutate(
    resp_pct_chg_from_first = (response / first(response) - 1) * 100
  )
#> # A tibble: 7 x 4
#> # Groups:   group [2]
#>   group response treatment resp_pct_chg_from_first
#>   <fct>    <dbl> <fct>                       <dbl>
#> 1 A            1 control                         0
#> 2 A            4 100mg                         300
#> 3 A            2 200mg                         100
#> 4 A            1 50mg                            0
#> 5 B            1 control                         0
#> 6 B            2 100mg                         100
#> 7 B            3 200mg                         200

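Created on 2019-03-20 by the reprex package (v0.2.1)

Answer 1: (score: 0)

JasonAizkalns's answer is good, but in case you want to keep your pct function: the original divides by lag(x), which is the previous value, rather than by the first value of the group. Changing it to use x[1] makes it work.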
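Note that first() depends on the control row being the first row in each group, and it reports 0 rather than NA for the control row itself. The following is a minimal sketch, not part of the original answer, that looks the baseline up by its label instead of its position and blanks the control row to match the desired output in the question. It assumes every group has exactly one row with treatment == "control"; the column names baseline and pct_chg are illustrative.

```r
library(dplyr)

df <- data.frame(
  group = c(rep("A", 4), rep("B", 3)),
  response = c(1, 4, 2, 1, 1, 2, 3),
  treatment = c("control", "100mg", "200mg", "50mg", "control", "100mg", "200mg")
)

result <- df %>%
  group_by(group) %>%
  mutate(
    # baseline: response of the "control" row in this group
    # (assumption: exactly one control row per group)
    baseline = response[treatment == "control"],
    pct_chg = (response / baseline - 1) * 100,
    # report NA for the control row itself, as in the desired output
    pct_chg = if_else(treatment == "control", NA_real_, pct_chg)
  ) %>%
  ungroup()

print(result)
```

If a group could have no control row or several, the baseline lookup would need an explicit check before the division.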

pct <- function(x) {
  ((x-x[1])/x[1]) * 100
}

> percChange
# A tibble: 7 x 3
# Groups:   group [2]
  group response treatment
  <fct>    <dbl> <fct>    
1 A            0 control  
2 A          300 100mg    
3 A          100 200mg    
4 A            0 50mg     
5 B            0 control  
6 B          100 100mg    
7 B          200 200mg
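For completeness, here is a sketch of the full pipeline around the corrected pct, since the answer shows only the function and the result. It calls mutate() directly instead of mutate_at() with funs(), because funs() is deprecated in recent dplyr releases; that substitution is a choice made here, not something stated in the answer.

```r
library(dplyr)

df <- data.frame(
  group = c(rep("A", 4), rep("B", 3)),
  response = c(1, 4, 2, 1, 1, 2, 3),
  treatment = c("control", "100mg", "200mg", "50mg", "control", "100mg", "200mg")
)

# percent change of every element relative to the first element of x
pct <- function(x) {
  ((x - x[1]) / x[1]) * 100
}

# group_by() makes mutate() call pct() once per group,
# so x[1] is the first response within each group
percChange <- df %>%
  group_by(group) %>%
  mutate(response = pct(response))

print(percChange)
```

As with first(), this takes whatever row comes first in each group as the baseline, so it gives the intended result only when the control row is listed first.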