Create a new factor based on two other values

Date: 2020-09-15 07:33:38

Tags: r

dput(dta[1:4, ])
structure(list(A = c(112551L, 112551L, 112551L, 112551L), set1.RTTime = 
c(349897L, 
349897L, 349897L, 349897L), set2.RTTime = c(592639L, 592639L, 
592639L, 592639L), set3.RTTime = c(840648L, 840648L, 840648L, 
840648L), set4.RTTime = c(1082053L, 1082053L, 1082053L, 1082053L
), set5.RTTime = c(1322732L, 1322732L, 1322732L, 1322732L), set6.RTTime = 
c(1559749L, 
1559749L, 1559749L, 1559749L), set7.RTTime = c(1802346L, 1802346L, 
1802346L, 1802346L), set8.RTTime = c(2041123L, 2041123L, 2041123L, 
2041123L), set9.RTTime = c(2278899L, 2278899L, 2278899L, 2278899L
), Tokens.RESP = structure(c(1L, 3L, 4L, 2L), .Label = c("A", 
"B", "C", "D"), class = "factor"), var.Block. = c(3000L, 1500L, 
1500L, 3000L)), row.names = c(NA, 4L), class = "data.frame") 

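I am trying to create a new factor that depends on the values of two other variables. One factor, named "Tokens.RESP", takes the value A, B, C, or D for each subject on each trial. The other variable, named "var.Block.", can be negative, positive, or zero on each trial (for example, a subject receives -1500 coins, 3000 coins, or 0 coins). I want to create a new factor named "condition" that takes one of 6 possible values on each trial:

  1. If the subject chose A or B and gained coins, the condition is "risky_gain",
  2. If the subject chose A or B and lost coins, the condition is "risky_loss",
  3. If the subject chose A or B and received 0 coins, the condition is "risky_neutral",
  4. If the subject chose C or D and gained coins, the condition is "safe_gain",
  5. If the subject chose C or D and lost coins, the condition is "safe_loss",
  6. If the subject chose C or D and received 0 coins, the condition is "safe_neutral".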

1 answer:

Answer 0 (score: 0)

For a base R approach, you could try:

dta$risk <- ifelse(dta$Tokens.RESP == "A" | dta$Tokens.RESP == "B", "risky", "safe")
dta$change <- ifelse(dta$var.Block. > 0, "gain", ifelse(dta$var.Block. < 0, "loss", "neutral"))
dta$condition <- factor(paste(dta$risk, dta$change, sep = "_"))

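This uses two intermediate steps: the first determines risky versus safe, the second determines gain, loss, or neutral, and the two results are then pasted together. When creating the factor, you can also specify the levels explicitly to ensure that all six are represented, even those that do not occur in the data.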
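As a minimal sketch of the explicit-levels idea, the example below uses a small made-up data frame named toy (hypothetical values, not the asker's data) and a helper vector all_levels, both introduced only for illustration. It also uses %in% as a shorter equivalent of the two == comparisons.

```r
# Toy data for illustration only (hypothetical values, not the asker's data)
toy <- data.frame(
  Tokens.RESP = factor(c("A", "C", "D", "B")),
  var.Block.  = c(3000L, -1500L, 0L, 3000L)
)

# Step 1: risky vs. safe; %in% is equivalent to == "A" | == "B"
risk <- ifelse(toy$Tokens.RESP %in% c("A", "B"), "risky", "safe")

# Step 2: gain / loss / neutral from the sign of var.Block.
change <- ifelse(toy$var.Block. > 0, "gain",
                 ifelse(toy$var.Block. < 0, "loss", "neutral"))

# Step 3: combine, listing all six levels so none is dropped
all_levels <- c("risky_gain", "risky_loss", "risky_neutral",
                "safe_gain", "safe_loss", "safe_neutral")
toy$condition <- factor(paste(risk, change, sep = "_"), levels = all_levels)

# Levels that do not occur in the data appear with a count of 0
table(toy$condition)
```

Without the levels argument, factor() would keep only the combinations that actually occur, which can matter later for tables, plots, or model contrasts.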

Another option is case_when from dplyr:

library(dplyr)

# Assign the result back so the new column is kept in dta
dta <- dta %>%
  mutate(condition = factor(case_when(
    (Tokens.RESP == "A" | Tokens.RESP == "B") & var.Block. > 0 ~ "risky_gain",
    (Tokens.RESP == "A" | Tokens.RESP == "B") & var.Block. < 0 ~ "risky_loss",
    (Tokens.RESP == "A" | Tokens.RESP == "B") & var.Block. == 0 ~ "risky_neutral",
    (Tokens.RESP == "C" | Tokens.RESP == "D") & var.Block. > 0 ~ "safe_gain",
    (Tokens.RESP == "C" | Tokens.RESP == "D") & var.Block. < 0 ~ "safe_loss",
    (Tokens.RESP == "C" | Tokens.RESP == "D") & var.Block. == 0 ~ "safe_neutral"
  ), levels = c("risky_gain", "risky_loss", "risky_neutral", "safe_gain", "safe_loss", "safe_neutral")))

This explicitly defines each level of the factor.

Or you can use fcase from data.table:

library(data.table)

setDT(dta)

dta[ , condition := fcase(
  (Tokens.RESP == "A" | Tokens.RESP == "B") & var.Block. > 0, "risky_gain",
  (Tokens.RESP == "A" | Tokens.RESP == "B") & var.Block. < 0, "risky_loss",
  (Tokens.RESP == "A" | Tokens.RESP == "B") & var.Block. == 0, "risky_neutral",
  (Tokens.RESP == "C" | Tokens.RESP == "D") & var.Block. > 0, "safe_gain",
  (Tokens.RESP == "C" | Tokens.RESP == "D") & var.Block. < 0, "safe_loss",
  (Tokens.RESP == "C" | Tokens.RESP == "D") & var.Block. == 0, "safe_neutral")]
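Note that fcase returns a character vector here, so the column is not yet a factor. The sketch below shows one way to wrap the call in factor(), assuming data.table 1.13.0 or later (where fcase is available). The table toy and the vector all_levels are hypothetical names used only for illustration.

```r
library(data.table)

# Toy data for illustration only (hypothetical values, not the asker's data)
toy <- data.table(
  Tokens.RESP = factor(c("A", "C", "D", "B")),
  var.Block.  = c(3000L, -1500L, 0L, -1500L)
)

all_levels <- c("risky_gain", "risky_loss", "risky_neutral",
                "safe_gain", "safe_loss", "safe_neutral")

# fcase() yields character values; factor() converts them and fixes the levels
toy[, condition := factor(
  fcase(
    Tokens.RESP %in% c("A", "B") & var.Block. >  0, "risky_gain",
    Tokens.RESP %in% c("A", "B") & var.Block. <  0, "risky_loss",
    Tokens.RESP %in% c("A", "B") & var.Block. == 0, "risky_neutral",
    Tokens.RESP %in% c("C", "D") & var.Block. >  0, "safe_gain",
    Tokens.RESP %in% c("C", "D") & var.Block. <  0, "safe_loss",
    Tokens.RESP %in% c("C", "D") & var.Block. == 0, "safe_neutral"
  ),
  levels = all_levels
)]

str(toy$condition)
```

Rows that match none of the conditions (for example, a missing var.Block.) would get NA, since no default is supplied to fcase.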