R - mutate if + dynamically create column names

Time: 2019-05-12 22:06:35

Tags: r dplyr

My df looks like this:

df <- read.table(text="
   expenses     month     paid_gas   paid_fees  paid_hotel   name
   100          2019-01   20         70         10           Jack Carver
   200          2019-02   40         140        20           Jack Carver
", header=TRUE)

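I want to calculate what share of the expenses column each column with the paid_ prefix makes up. In other words, I would like to create something like this: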

result <- 
  mutate(
    df,
    prc_gas = paid_gas / expenses
  )

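But I don't want to do this manually for each column, because my df has dozens of paid_ columns, and the name of each newly created column should always be the text that follows the prefix. So the result should be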

 result  <- read.table(text="
       expenses     month     paid_gas   paid_fees  paid_hotel   name           prc_gas    prc_fees   prc_hotel
       100          2019-01   20         70         10           Jack Carver    20         70         10     
       200          2019-02   40         140        20           Jack Carver    20         70         10     
    ", header=TRUE) 

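3 answers:

Answer 0: (score: 3)

We can use mutate_at with a named list of functions to create the new columns automatically. Each new column is named after the original column, followed by the list name as a suffix.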

library (dplyr) # for mutate_at()

df %>% mutate_at(vars(starts_with("paid")), list(prc = ~. / expenses))
#  expenses   month paid_gas paid_fees paid_hotel        name paid_gas_prc
#1      100 2019-01       20        70         10 Jack Carver          0.2
#2      200 2019-02       40       140         20 Jack Carver          0.2
#  paid_fees_prc paid_hotel_prc
#1           0.7            0.1
#2           0.7            0.1

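Note that the sample data df in the question is missing single quotes around the name values; without them, read.table splits "Jack Carver" into two fields.


Sample data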

df <- read.table(text="expenses     month     paid_gas   paid_fees  paid_hotel   name
  100          2019-01   20         70         10           'Jack Carver'
  200          2019-02   40         140        20           'Jack Carver'", header=TRUE)
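
The mutate_at call above yields names such as paid_gas_prc, whereas the question asks for prc_gas. A minimal sketch of one way to get that naming, assuming dplyr 1.0.0 or later (for across() and rename_with()); the intermediate _prc suffix and the regular expression are illustrative choices, not part of the original answer:

```r
library(dplyr)

# Sample data with the name values quoted so read.table parses them as one field
df <- read.table(text = "expenses month paid_gas paid_fees paid_hotel name
100 2019-01 20 70 10 'Jack Carver'
200 2019-02 40 140 20 'Jack Carver'", header = TRUE)

result <- df %>%
  # Divide every paid_ column by expenses; new columns get a temporary _prc suffix
  mutate(across(starts_with("paid_"), ~ .x / expenses, .names = "{.col}_prc")) %>%
  # Turn paid_<x>_prc into prc_<x>, touching only the newly created columns
  rename_with(~ sub("^paid_(.*)_prc$", "prc_\\1", .x), ends_with("_prc"))

names(result)
```

The expected table in the question shows percentages (20, 70, 10) rather than fractions; multiplying by 100 inside the lambda would produce those values.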

Answer 1: (score: 1)

We can also use base R lapply to compute multiple columns

inds <- grep("^paid", names(df), value = TRUE)
df[paste0("perc_", inds)] <- lapply(df[inds], function(x) x/df$expenses)
#  expenses   month paid_gas paid_fees paid_hotel        name
#1      100 2019-01       20        70         10 Jack Carver
#2      200 2019-02       40       140         20 Jack Carver
#  perc_paid_gas perc_paid_fees perc_paid_hotel
#1           0.2            0.7             0.1
#2           0.2            0.7             0.1

The same can be done with mapply.
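
A minimal sketch of what an mapply variant might look like, assuming the same sample df. This is one possible form and not necessarily the code the answer originally contained; the argument name total is illustrative:

```r
df <- read.table(text = "expenses month paid_gas paid_fees paid_hotel name
100 2019-01 20 70 10 'Jack Carver'
200 2019-02 40 140 20 'Jack Carver'", header = TRUE)

inds <- grep("^paid", names(df), value = TRUE)

# mapply iterates over the paid_ columns; MoreArgs hands the full expenses
# vector to every call unchanged. SIMPLIFY = FALSE keeps the result a list,
# one element per new column.
df[paste0("perc_", inds)] <- mapply(
  function(x, total) x / total,
  df[inds],
  MoreArgs = list(total = df$expenses),
  SIMPLIFY = FALSE
)
df
```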

Answer 2: (score: 0)

Here is an option with data.table

library(data.table)
nm1 <- startsWith(names(df), "paid")
setDT(df)[, paste0("perc_", names(df)[nm1]) :=
             lapply(.SD, `/`, expenses), .SDcols = nm1]
df
#   expenses   month paid_gas paid_fees paid_hotel        name perc_paid_gas perc_paid_fees perc_paid_hotel
#1:      100 2019-01       20        70         10 Jack Carver           0.2            0.7             0.1
#2:      200 2019-02       40       140         20 Jack Carver           0.2            0.7             0.1
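
The code above converts df in place with setDT and names the new columns perc_paid_gas and so on. A minimal sketch of a variant that leaves df untouched and uses the prc_ naming from the question, assuming the same sample data; the names dt, paid_cols and prc_cols are illustrative:

```r
library(data.table)

df <- read.table(text = "expenses month paid_gas paid_fees paid_hotel name
100 2019-01 20 70 10 'Jack Carver'
200 2019-02 40 140 20 'Jack Carver'", header = TRUE)

# as.data.table() copies, so the original data frame is not modified by reference
dt <- as.data.table(df)

paid_cols <- grep("^paid_", names(dt), value = TRUE)
# Swap the paid_ prefix for prc_, so paid_gas becomes prc_gas
prc_cols <- sub("^paid_", "prc_", paid_cols)

# Parentheses around prc_cols make := use the character vector as column names
dt[, (prc_cols) := lapply(.SD, `/`, expenses), .SDcols = paid_cols]
dt
```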