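Building a mean scale in R

Time: 2016-10-15 20:11:32

Tags: r scale

I am trying to analyze a longitudinal questionnaire consisting of 5 questions. Let's say the participants answer the questionnaire 4 times.

Now I want to construct a scale called task performance. This scale is built as the mean of questions 1, 2 and 5.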

I need this mean for each of the 4 measurement points. Is there an easier way to calculate these means than the one in my example below?

df$performance_T1 <- with(df, rowMeans(cbind(Task1T1,Task2T1,Task5T1), na.rm = TRUE))
df$performance_T2 <- with(df, rowMeans(cbind(Task1T2,Task2T2,Task5T2), na.rm = TRUE))
df$performance_T3 <- with(df, rowMeans(cbind(Task1T3,Task2T3,Task5T3), na.rm = TRUE))
df$performance_T4 <- with(df, rowMeans(cbind(Task1T4,Task2T4,Task5T4), na.rm = TRUE))

In this example, my data frame looks like this:

# Use `=` (not `<-`) inside data.frame() so the columns get the intended names
df = data.frame(Task1T1 = 1:5,
            Task2T1 = 1:5,
            Task3T1 = 1:5,
            Task4T1 = 1:5,
            Task5T1 = 1:5,
            Task1T2 = 1:5,
            Task2T2 = 1:5,
            Task3T2 = 1:5,
            Task4T2 = 1:5,
            Task5T2 = 1:5,
            Task1T3 = 1:5,
            Task2T3 = 1:5,
            Task3T3 = 1:5,
            Task4T3 = 1:5,
            Task5T3 = 1:5,
            Task1T4 = 1:5,
            Task2T4 = 1:5,
            Task3T4 = 1:5,
            Task4T4 = 1:5,
            Task5T4 = 1:5)

1 answer:

Answer 0: (score: 0)

I am assuming that in your dataset:

  1. Each row is one person.
  2. Each combination of "task" and "measurement" is in its own column.
We can use the tidyverse and the principles of tidy data to restructure the data.

In the example below, I have added an explicit person_id column for clarity.

If you want the data back in a one-row-per-person format, with the task means in columns, you can spread it back.

    library(tidyverse)
    
    df = structure(list(Task1T1 = 1:5, Task2T1 = 1:5, Task3T1 = 1:5, Task4T1 = 1:5, 
                        Task5T1 = 1:5, Task1T2 = 1:5, Task2T2 = 1:5, Task3T2 = 1:5, 
                        Task4T2 = 1:5, Task5T2 = 1:5, Task1T3 = 1:5, Task2T3 = 1:5, 
                        Task3T3 = 1:5, Task4T3 = 1:5, Task5T3 = 1:5, Task1T4 = 1:5, 
                        Task2T4 = 1:5, Task3T4 = 1:5, Task4T4 = 1:5, Task5T4 = 1:5), 
                   .Names = c("Task1T1", "Task2T1", "Task3T1", "Task4T1", "Task5T1", 
                              "Task1T2", "Task2T2", "Task3T2", "Task4T2", "Task5T2", 
                              "Task1T3", "Task2T3", "Task3T3", "Task4T3", "Task5T3", 
                              "Task1T4", "Task2T4", "Task3T4", "Task4T4", "Task5T4"), 
                   row.names = c(NA, -5L), class = "data.frame")
    
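    # Reshape to long format, then average each task across the measurements
    df %>% 
        mutate(person_id = row_number()) %>%
        gather(task, score, -person_id) %>%
        # split e.g. "Task1T2" into Task = "Task1" and Measurement = "2"
        separate(task, into = c("Task", "Measurement"), sep = "T(?=\\d)") %>%
        group_by(Task, person_id) %>%
        summarise(Performance = mean(score, na.rm = TRUE)) ->
        perf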
    
    head(perf)
    #> Source: local data frame [6 x 3]
    #> Groups: Task [2]
    #> 
    #>    Task person_id Performance
    #>   <chr>     <int>       <dbl>
    #> 1 Task1         1           1
    #> 2 Task1         2           2
    #> 3 Task1         3           3
    #> 4 Task1         4           4
    #> 5 Task1         5           5
    #> 6 Task2         1           1
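
The pipeline above averages each task across the four measurements. The question asks for something slightly different: the mean of Tasks 1, 2 and 5 within each measurement. A minimal sketch of that variant, assuming the same toy data and the TaskXTY column naming, might look like the following. The names scale_items and performance are illustrative and not from the original post. The sketch also spreads the result back to one row per person and, as a cross-check, adds the same columns with a plain base R loop.

```r
library(tidyverse)

# Toy data as in the question: 5 people, 5 tasks, 4 measurements,
# with columns named Task1T1, Task2T1, ..., Task5T4.
task_names <- paste0("Task", rep(1:5, times = 4), "T", rep(1:4, each = 5))
df <- as.data.frame(setNames(replicate(20, 1:5, simplify = FALSE), task_names))

# Items assumed to form the task performance scale (hypothetical helper name).
scale_items <- c("Task1", "Task2", "Task5")

performance <- df %>%
  mutate(person_id = row_number()) %>%
  gather(task, score, -person_id) %>%
  separate(task, into = c("Task", "Measurement"), sep = "T(?=\\d)") %>%
  # keep only the items that belong to the scale
  filter(Task %in% scale_items) %>%
  # one mean per person and measurement point
  group_by(person_id, Measurement) %>%
  summarise(Performance = mean(score, na.rm = TRUE)) %>%
  ungroup() %>%
  # back to wide format: one row per person, one column per measurement
  mutate(Measurement = paste0("performance_T", Measurement)) %>%
  spread(Measurement, Performance)

# Base R cross-check: loop over the measurement points instead of
# writing one rowMeans() call per time point.
for (t in 1:4) {
  cols <- paste0(scale_items, "T", t)
  df[[paste0("performance_T", t)]] <- rowMeans(df[cols], na.rm = TRUE)
}
```

If only the extra columns are needed, the base R loop alone may be enough. The long format is mainly useful when further grouped summaries or plots are planned.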