How do I use recode_factor in dplyr to recode multiple factor values?

Asked: 2018-04-03 15:16:44

Tags: r dplyr tidyverse recode

     countrycode event
1713         ESP 110mh
1009         NED    HJ
536          BLR    LJ
2882         FRA 1500m
509          EST    LJ
2449         BEL    PV
1022         EST    HJ
2530         USA    JT
2714         CUB    JT
1236         HUN  400m
238          BLR  100m
2518         USA    JT
1575         FRA 110mh
615          JPN    LJ
1144         GER    HJ
596          CAN    LJ
2477         HUN    JT
1046         GER    HJ
2501         FIN    DT
2176         KAZ    PV

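I want to create a new factor vector, eventtype, in my data frame, where:

rows whose event variable is 100m, 400m, 110mh, or 1500m are grouped as Runs; DT, SP, and JT are grouped as Throws; LJ, HJ, and PV are grouped as Jumps.

I can recode a single value on its own, e.g. df$eventtype <- recode_factor(df$event, `100m`="Running") works for one event, but I have looked through the documentation and cannot find a simple way to convert multiple values in one function call.

Edit: Of course, if another function suits my purpose better, I will use that instead.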

2 answers:

Answer 0 (score: 2)

The ... argument of recode_factor accepts any number of replacements, so every mapping can go into a single call:

library(dplyr)

df <- read.table(header = TRUE, text = "
number countrycode event
1713         ESP 110mh
1009         NED    HJ
536          BLR    LJ
2882         FRA 1500m
509          EST    LJ
2449         BEL    PV
1022         EST    HJ
2530         USA    JT
2714         CUB    JT
1236         HUN  400m
238          BLR  100m
2518         USA    JT
1575         FRA 110mh
615          JPN    LJ
1144         GER    HJ
596          CAN    LJ
2477         HUN    JT
1046         GER    HJ
2501         FIN    DT
2176         KAZ    PV
")

df$eventtype <- recode_factor(df$event, `100m` = "Runs", `400m` = "Runs", 
                              `110mh` = "Runs", `1500m` = "Runs", 
                              DT = "Throws", SP = "Throws", JT = "Throws",
                              LJ = "Jumps", HJ = "Jumps", PV = "Jumps")

# or inside a mutate command
df %>% 
  mutate(eventtype = recode_factor(event, `100m` = "Runs", `400m` = "Runs", 
                                   `110mh` = "Runs", `1500m` = "Runs", 
                                   DT = "Throws", SP = "Throws", JT = "Throws",
                                   LJ = "Jumps", HJ = "Jumps", PV = "Jumps"))
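
If typing out every pair feels repetitive, one possible variation is to keep the mapping in a named character vector and splice it into recode_factor with !!!, as the dplyr documentation for recode shows. This is only a sketch: the name event_key and the short event vector below are made up for illustration and are not part of the question's data frame.

```r
library(dplyr)

# Hypothetical lookup: names are the old event codes, values are the new groups
event_key <- c(`100m` = "Runs", `400m` = "Runs", `110mh` = "Runs", `1500m` = "Runs",
               DT = "Throws", SP = "Throws", JT = "Throws",
               LJ = "Jumps", HJ = "Jumps", PV = "Jumps")

# Small illustrative sample of event codes
event <- c("110mh", "HJ", "LJ", "1500m", "JT", "DT", "PV")

# !!! splices the named vector as if each old = "new" pair were typed by hand
eventtype <- recode_factor(event, !!!event_key)
```

With the question's data frame, the same call should fit inside mutate(eventtype = recode_factor(event, !!!event_key)), and the lookup can then be edited in one place.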

Answer 1 (score: 1)

ifelse is all you need. Here is some sample code, since you did not provide a reproducible example.

countycode = c("ESP", "HUN", "KAZ")
event = c("100m", "JT", "PV")
data = data.frame(countycode, event)

# generate the recode groups (all events listed in the question).
runs = c("100m", "400m", "110mh", "1500m")
throws = c("DT", "SP", "JT")
jumps = c("LJ", "HJ", "PV")

# add another column.
data$eventtype = ifelse(data$event %in% runs, "Runs", 
                        ifelse(data$event %in% throws, "Throws",
                              ifelse(data$event %in% jumps, "Jumps",
                                     NA)))

After running it:

> data
  countycode event eventtype
1        ESP  100m      Runs
2        HUN    JT    Throws
3        KAZ    PV     Jumps
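
Since the question is tagged dplyr, the nested ifelse calls could also be written with case_when, which reads top to bottom and avoids the closing-parenthesis pile-up. The following is a minimal sketch of that alternative on the same three-row example; it is a swapped-in technique, not the code from the answer above.

```r
library(dplyr)

# Same toy data as above
data <- data.frame(countycode = c("ESP", "HUN", "KAZ"),
                   event = c("100m", "JT", "PV"))

# Recode groups, covering every event named in the question
runs   <- c("100m", "400m", "110mh", "1500m")
throws <- c("DT", "SP", "JT")
jumps  <- c("LJ", "HJ", "PV")

data <- data %>%
  mutate(eventtype = case_when(
    event %in% runs   ~ "Runs",
    event %in% throws ~ "Throws",
    event %in% jumps  ~ "Jumps",
    TRUE              ~ NA_character_  # unmatched events stay missing
  ))
```

The result is a character column; wrapping the case_when call in factor() would give a factor if one is needed.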