Question

I'm trying to write a function in R to score a clinical test. My R skills are very rusty, and I would really appreciate any help.

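The function I'm trying to write needs to take 31 values (the clinical test has 31 questions, which the patient fills in). Each of the 31 values is then scored separately (most questions have different ranges), and the scores are combined into weighted averages for different parameters.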

Scoring ranges:

For Q1 (defined as x1): multiply the response by 10

For Q2, 6, 5, 9 (scored on a 6-point scale):
1 - 100
2 - 80
3 - 60
4 - 40
5 - 20
6 - 0

For Q3, 4, 7, 8, 10, 11, 12, 13, 16, 17, 18 (scored on a 6-point scale):
1 - 0
2 - 20
3 - 40
4 - 60
5 - 80
6 - 100

For Q14, 25, 26, 27, 28, 29, 30 (scored on a 5-point scale):
1 - 100
2 - 75
3 - 50
4 - 25
5 - 0

For Q19, 20 (scored on a 5-point scale):
1 - 0
2 - 25
3 - 50
4 - 75
5 - 100

For Q15, 21, 23, 24 (scored on a 4-point scale):
1 - 0
2 - 33.3
3 - 66.7
4 - 100

For Q22:

1 - 0
2 - 50
3 - 100

qolie31 <- function(x1, x2, x3, ...){
  x1a <- x1*10 
  z <- c(x2, x5, x6, x9)  
  {for (i in z){
    if (i==1){x == 100}
    else if(i==2){x == 80}
    else if(i==3){x==60}
    else if(i==4){x==40}
    else if(i==5){x==20}
    else (i==6){x==0}
    z2 <- x
  }
}

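My questions:

I used ... in the first line of the function to stand for the arguments x1 to x31. My end goal is to avoid defining them manually from 1 to 31. Could someone tell me how to define the arguments x1 to x31 without writing them all out by hand?
How do I save the new scores inside the function so that I can use them later for analysis?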

Answer 1

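In general, you can capture an arbitrary number of arguments passed through ... by using list(...). See this other question for details. However, that is usually best when you don't know how many arguments will be supplied and you want to be able to handle them anyway. In this case you know there should be 31 answers, so ... is not appropriate. Instead, you should store your answers in a vector of length 31 and supply that as the argument. An example is given below. Here, I create short one-liners that transform each group of answers according to the rules you laid out. This takes advantage of R's vectorised arithmetic, which I think is cleaner (and faster?) than using if statements. We then apply the transformations to each group of answers and assign the results to the output scores. The example uses some random answers from 1 to 3.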
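To make the first point concrete, here is a minimal sketch of capturing arguments with list(...). The function name collect_answers is hypothetical and only serves as an illustration; for this problem the vector approach described above is still preferable.

```r
# Hypothetical helper: accepts any number of answers as separate arguments.
collect_answers <- function(...) {
  # list(...) captures every argument supplied; unlist() flattens them
  # into a plain vector so they can be indexed like answers[22].
  unlist(list(...))
}

collect_answers(3, 1, 2)

# The same data held in a single vector, which can be passed as one argument
ans <- c(3, 1, 2)
length(ans)
```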

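If you're worried that typos might be a problem, I've added some commented-out code that uses assert_that to check for errors. You could also check inside each score_ function that the answers are within the correct range; for example, the answer to question 22 should never have the value 4.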

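For the last part, you don't need to do the assignment inside the function. Just make sure the function returns what you want, and do the assignment when you call it, as shown below.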

eg_ans <- sample.int(3, 31, replace = TRUE)

transform_scores <- function(answers){
  # assertthat::assert_that(
  #   length(answers) == 31,
  #   msg = "There are not 31 values in input vector"
  # )
  score1 <- function(ans) ans * 10
  score6a <- function(ans) (6 - ans) * 20
  score6b <- function(ans) (ans - 1) * 20
  score5a <- function(ans) (5 - ans) * 25
  score5b <- function(ans) (ans - 1) * 25
  score4 <- function(ans) (ans - 1) * (100 / 3)
  score3 <- function(ans) (ans - 1) * 50

  scores <- numeric(31)
  scores[1] <- score1(answers[1])
  scores[c(2, 5:6, 9)] <- score6a(answers[c(2, 5:6, 9)])
  scores[c(3:4, 7:8, 10:13, 16:18)] <- score6b(answers[c(3:4, 7:8, 10:13, 16:18)])
  scores[c(14, 25:30)] <- score5a(answers[c(14, 25:30)])
  scores[19:20] <- score5b(answers[19:20])
  scores[c(15, 21, 23:24)] <- score4(answers[c(15, 21, 23:24)])
  scores[22] <- score3(answers[22])
  return(scores)
}

eg_scores <- transform_scores(eg_ans)
eg_scores
#>  [1]  30.00000  60.00000   0.00000  20.00000 100.00000 100.00000   0.00000
#>  [8]  20.00000  60.00000  20.00000   0.00000  40.00000   0.00000  75.00000
#> [15]  66.66667   0.00000   0.00000  20.00000  50.00000  50.00000  66.66667
#> [22] 100.00000   0.00000  33.33333 100.00000  75.00000 100.00000 100.00000
#> [29] 100.00000  50.00000   0.00000

Created on 2018-04-24 by the reprex package (v0.2.0).
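The range check suggested above for the score_ functions could be sketched as follows. This is only an illustration: it uses base R's stopifnot rather than assert_that, and score3_checked is a hypothetical name, not part of the code above.

```r
# Hypothetical variant of the 3-point scorer (question 22) with a range check.
score3_checked <- function(ans) {
  # Stop early if any answer is missing or outside the valid range 1..3
  stopifnot(!anyNA(ans), all(ans %in% 1:3))
  (ans - 1) * 50
}

score3_checked(c(1, 2, 3))
# score3_checked(4) would stop with an error instead of returning 150
```

The same pattern would apply to the other scorers, with the valid range changed to 1:4, 1:5 or 1:6 as appropriate.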

Answer 2

You can use the mapvalues function from the plyr package.

    rescaleq<- function(x){
    require(plyr)
    if (length(x) != 30) stop("Vector of 30 elements required")
    x[1]<- x[1]*10
    x[c(2, 5, 6, 9)]<- mapvalues(x[c(2, 5, 6, 9)], from = 1:6, to = seq(100, 0, by = -20))
    x[c(3,4,7,8,10,11,12,13,16,17,18)]<- mapvalues(x[c(3,4,7,8,10,11,12,13,16,17,18)], from  = 1:6, to = seq(0, 100, by = 20))
    x[c(14, 25, 26, 27, 28, 29, 30)]<- mapvalues(x[c(14, 25, 26, 27, 28, 29, 30)], from = 1:5, to = seq(100, 0, by = -25))
    x[c(19, 20)]<- mapvalues(x[c(19, 20)], from = 1:5, to = seq(0, 100, by = 25))
    x[c(15, 21, 23, 24)]<- mapvalues(x[c(15, 21, 23, 24)], from = 1:4, to = seq(0, 100, length.out = 4))
     x[22]<- mapvalues(x[22], from = 1:3, to = seq(0, 100, by = 50))
    return(round(x, 2))
}

And test it with some data:

> xvector <- sample.int(3, 31, replace=T)
> xvector
# [1] 2 1 3 2 2 3 2 1 1 3 1 3 1 1 1 1 2 1 3 1 1 2 1 1 2 2 3 1 3 3 
> rescaleq(xvector[-31]) # Note that below, these are messages NOT errors or warnings
#The following `from` values were not present in `x`: 4, 5, 6
#The following `from` values were not present in `x`: 4, 5, 6
#The following `from` values were not present in `x`: 4, 5
#The following `from` values were not present in `x`: 2, 4, 5
#The following `from` values were not present in `x`: 3, 4
#The following `from` values were not present in `x`: 1, 3
# [1]  20.00 100.00  80.00  60.00 100.00  40.00  20.00  20.00   0.00  40.00   0.00  40.00
#[13]   0.00   0.00  20.00   0.00 100.00  75.00  75.00  50.00 100.00  50.00  50.00  50.00
#[25]   0.00  33.33   0.00   0.00   0.00  50.00

If you want to get rid of the messages generated by mapvalues, try wrapping it in suppressMessages, i.e. suppressMessages(mapvalues(x[c(2, 5, 6, 9)], from = 1:6, to = seq(100, 0, by = -20))), and so on.
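As a small self-contained illustration of how suppressMessages behaves, the sketch below uses a hypothetical function, noisy_double, as a stand-in for any function that reports through message() the way mapvalues does. It does not use plyr.

```r
# Hypothetical stand-in for a function that emits a message, like mapvalues
noisy_double <- function(x) {
  message("doubling ", length(x), " value(s)")
  x * 2
}

# The message is silenced; the return value is unaffected
res <- suppressMessages(noisy_double(c(1, 2, 3)))
res
```

Only messages are silenced this way; warnings and errors still come through.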

Answer 3

Another way, this time using the tidyverse and a lookup table:

library(tidyverse)

data = "
1                             | 10
2,6,5,9                       | 100,80,60,40,20,0
3,4,7,8,10,11,12,13,16,17,18  | 0,20,40,60,80,100
14, 25, 26, 27, 28, 29, 30    | 100,75,50,25,0
19,20                         | 0,25,50,75,100
15, 21, 23, 24                | 0, 33.3, 66.7, 100
22                            | 0,50,100
"

df <- read.table(text = data, sep = '|', 
                 stringsAsFactors = F, 
                 col.names = c('q', 'factor'),
                 strip.white = T)

# create the lookup table
# save it somewhere
# as we only need to generate it once
lookup <- df %>%
  separate_rows(q, sep = ',\\s*', convert = TRUE) %>%
  separate_rows(factor, sep = ',\\s*', convert = TRUE) %>%
  group_by(q) %>%
  mutate(item = 1:n()) %>%
  ungroup()

# calculate the score
calc_score <- function(x) {
  score <- 0
  for (i in seq_along(x)) {
    f <- lookup %>% filter(q == i, item == x[i]) %>% select(factor) %>% pull()
    score <- score + i * f
  }
  score
}

v <- c(1,4,3)
(score <- calc_score(v))

This example gives a score of 210.
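The arithmetic behind that result can be checked by hand with a base-R sketch. The variable names tables and looked_up are made up for this illustration; the score tables are copied from the lookup data above, and the weighting by question number mirrors the score + i * f line in calc_score.

```r
# Score tables for questions 1 to 3, copied from the lookup data
tables <- list(
  c(10),                       # Q1: the lookup table holds a single entry
  c(100, 80, 60, 40, 20, 0),   # Q2
  c(0, 20, 40, 60, 80, 100)    # Q3
)

v <- c(1, 4, 3)

# Look up each answer in its own table: 10, 40, 40
looked_up <- mapply(function(tbl, ans) tbl[ans], tables, v)

# Weight each looked-up value by its question number, as calc_score does:
# 1 * 10 + 2 * 40 + 3 * 40 = 210
sum(seq_along(v) * looked_up)
```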

Nested for and if loops in R
