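I have some survey data on a 5-point Likert scale. However, in some of the response columns, some factor levels are missing. Here is the data:
Increased student engagement,Instructional time effectiveness increased,Increased student confidence,Increased student performance in class assignments,Increased learning of the students,Added unique learning activities
Strongly agree,Strongly agree,Strongly agree,Strongly agree,Strongly agree,Strongly agree
Neither agree nor disagree,Neither agree nor disagree,Neither agree nor disagree,Neither agree nor disagree,Neither agree nor disagree,Neither agree nor disagree
Disagree,Strongly disagree,Neither agree nor disagree,Disagree,Disagree,Neither agree nor disagree
As you can see, some response columns are missing factor levels. For example, in the first column, Agree and Strongly disagree are missing (for simplicity, I have pasted only part of the actual dataset).
I am using the following code in R: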
facultyData <- read_excel("FacultyResponsesForR.xlsx")
facultyData[] <- lapply( facultyData, factor)
facultyData[1:6] <- lapply( facultyData[1:6], factor, levels=1:5)
likertData <- likert(facultyData, nlevels = 5)
plot(likertData)
However, this results in the following error:
Error in mean(as.numeric(items[, i]), na.rm = TRUE) :
(list) object cannot be coerced to type 'double'
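I have tried the solution mentioned in other posts (the one in the code comment, facultyData[] <- lapply(facultyData[], factor, levels=1:5)), but it does not work.
Apparently, before executing this lapply, the data contains: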
# A tibble: 14 × 1
`Increased student engagement`
<fctr>
1 Strongly agree
2 Agree
3 Agree
4 Agree
5 Agree
6 Agree
7 Agree
8 Agree
9 Agree
10 Neither agree nor disagree
11 Neither agree nor disagree
12 Neither agree nor disagree
13 Neither agree nor disagree
14 Disagree
After executing it, the data is overwritten with NA values. Why does this happen?
> facultyData[1:6] <- lapply( facultyData[1:6], factor, levels=1:5)
> facultyData[,1]
# A tibble: 14 × 1
`Increased student engagement`
<fctr>
1 NA
2 NA
3 NA
4 NA
5 NA
6 NA
7 NA
8 NA
9 NA
10 NA
11 NA
12 NA
13 NA
14 NA
After changing the code as follows, the data is retained (it does not become NA), but I still get the same error:
mylevels <- c('Strongly disagree', 'Disagree', 'Neither agree nor disagree', 'Agree', 'Strongly agree')
facultyData <- read_excel("FacultyResponsesForR.xlsx")
facultyData[] <- lapply( facultyData, factor)
facultyData[1:6] <- lapply( facultyData[1:6], factor, levels=mylevels)
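This solution did not work for me: https://github.com/jbryer/likert/blob/master/demo/UnusedLevels.R
Answer 0: (score: 2)
Rewriting your data was no fun, and this took a little while to figure out, but I think it will help you. Someone may have a shorter way of doing it. Let me know if it helps.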
df <- rbind(c("Strongly agree","Strongly agree","Strongly agree","Strongly agree","Strongly agree","Strongly agree"),
c("Neither agree nor disagree","Neither agree nor disagree","Neither agree nor disagree","Neither agree nor disagree","Neither agree nor disagree","Neither agree nor disagree"),
c("Disagree","Strongly disagree","Neither agree nor disagree","Disagree","Disagree","Neither agree nor disagree"))
df <- as.data.frame(df)
colnames(df) <- c("Increased student engagement", "Instructional time effectiveness increased", "Increased student confidence", "Increased student performance in class assignments", "Increased learning of the students", "Added unique learning activities")
lookup <- data.frame(levels = 1:5, mylabels = c('Strongly disagree', 'Disagree', 'Neither agree nor disagree', 'Agree', 'Strongly agree'))
df.1 <- as.data.frame(apply(df, 2, function(x) match(x, lookup$mylabels)))
df.new <- as.data.frame(lapply(as.list(df.1), factor, levels = lookup$levels, labels = lookup$mylabels))
str(df.new)
'data.frame': 3 obs. of 6 variables:
$ Increased.student.engagement : Factor w/ 5 levels "Strongly disagree",..: 5 3 2
$ Instructional.time.effectiveness.increased : Factor w/ 5 levels "Strongly disagree",..: 5 3 1
$ Increased.student.confidence : Factor w/ 5 levels "Strongly disagree",..: 5 3 3
$ Increased.student.performance.in.class.assignments: Factor w/ 5 levels "Strongly disagree",..: 5 3 2
$ Increased.learning.of.the.students : Factor w/ 5 levels "Strongly disagree",..: 5 3 2
$ Added.unique.learning.activities : Factor w/ 5 levels "Strongly disagree",..: 5 3 3
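As a side note on why levels=1:5 turned everything into NA in the question: factor() matches the values in the data against the levels it is given, and none of the text responses equals "1" to "5". The lookup above sidesteps this by converting to codes first. A minimal sketch of the same idea in base R, where the toy vector x and the names as_codes and as_labels are hypothetical and only for illustration:

```r
# Hypothetical toy column; the labels are the ones used in the question.
x <- c("Strongly agree", "Agree", "Neither agree nor disagree", "Disagree")
mylabels <- c("Strongly disagree", "Disagree", "Neither agree nor disagree",
              "Agree", "Strongly agree")

# levels = 1:5 asks factor() to look for the values "1".."5" in x.
# None of the text responses match, so every element becomes NA.
as_codes <- factor(x, levels = 1:5)
all(is.na(as_codes))
#> [1] TRUE

# Passing the labels themselves as levels keeps the data and all five
# levels, including the unused "Strongly disagree".
as_labels <- factor(x, levels = mylabels)
nlevels(as_labels)
#> [1] 5
as.integer(as_labels)
#> [1] 5 4 3 2
```

If this reading is right, the match() step is only needed when you want the integer codes as an intermediate result; otherwise passing the label vector as levels should give the same factor.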
Answer 1: (score: 2)
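I created an Excel file using your example data. Reading it in with read_excel gives the following:
library(readxl)
dat <- read_excel("factor_labels.xlsx")
dat
#> # A tibble: 3 × 6
#> `Increased student engagement`
#> <chr>
#> 1 Strongly agree
#> 2 Neither agree nor disagree
#> 3 Disagree
#> # ... with 5 more variables: `Instructional time effectiveness
#> # increased` <chr>, `Increased student confidence` <chr>, `Increased
#> # student performance in class assignments` <chr>, `Increased learning
#> # of the students` <chr>, `Added unique learning activities` <chr>
You are right that read_excel does not convert character variables to factors. This is deliberate, because treating character variables as categorical is often unnecessary or inappropriate. Even when we do want factors, it is better to convert explicitly, so that each factor has the right levels in the right order (by default, a factor is created with only the levels present in the variable, sorted alphabetically). Sometimes we may want to do something more complex, such as renaming or regrouping levels, but here we do not want to change the levels, only to specify the full set. One way to create the desired factors is to use mutate_all from dplyr. Note the change from <chr> to <fctr> in the printed output:
mylevels <- c("Strongly disagree", "Disagree", "Neither agree nor disagree",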
"Agree", "Strongly agree")
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
dat <- dat %>% mutate_all(factor, levels = mylevels)
dat
#> # A tibble: 3 × 6
#> `Increased student engagement`
#> <fctr>
#> 1 Strongly agree
#> 2 Neither agree nor disagree
#> 3 Disagree
#> # ... with 5 more variables: `Instructional time effectiveness
#> # increased` <fctr>, `Increased student confidence` <fctr>, `Increased
#> # student performance in class assignments` <fctr>, `Increased learning
#> # of the students` <fctr>, `Added unique learning activities` <fctr>
lapply(dat, levels)
#> $`Increased student engagement`
#> [1] "Strongly disagree" "Disagree"
#> [3] "Neither agree nor disagree" "Agree"
#> [5] "Strongly agree"
#>
#> $`Instructional time effectiveness increased`
#> [1] "Strongly disagree" "Disagree"
#> [3] "Neither agree nor disagree" "Agree"
#> [5] "Strongly agree"
#>
#> $`Increased student confidence`
#> [1] "Strongly disagree" "Disagree"
#> [3] "Neither agree nor disagree" "Agree"
#> [5] "Strongly agree"
#>
#> $`Increased student performance in class assignments`
#> [1] "Strongly disagree" "Disagree"
#> [3] "Neither agree nor disagree" "Agree"
#> [5] "Strongly agree"
#>
#> $`Increased learning of the students`
#> [1] "Strongly disagree" "Disagree"
#> [3] "Neither agree nor disagree" "Agree"
#> [5] "Strongly agree"
#>
#> $`Added unique learning activities`
#> [1] "Strongly disagree" "Disagree"
#> [3] "Neither agree nor disagree" "Agree"
#> [5] "Strongly agree"
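Compare this with a conversion that does not specify the levels, such as calling factor on each column with its defaults. Because the variables in this subset do not contain all of the levels, the number of levels then differs between columns, and the levels are not always in a logical order, which needs fixing. This is a common source of error and frustration!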
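To make that comparison concrete, here is a small base R sketch. It assumes a hypothetical two-column subset of the example data (the column names engagement and time are shortened stand-ins, not the real headers) and contrasts the default conversion with one that passes the full set of levels:

```r
# Hypothetical two-column subset of the example data (base R only).
dat <- data.frame(
  engagement = c("Strongly agree", "Neither agree nor disagree", "Disagree"),
  time = c("Strongly agree", "Neither agree nor disagree", "Strongly disagree"),
  stringsAsFactors = FALSE
)
mylevels <- c("Strongly disagree", "Disagree", "Neither agree nor disagree",
              "Agree", "Strongly agree")

# Default conversion: each column keeps only the levels it happens to
# contain, sorted alphabetically rather than from disagree to agree.
default_levels <- lapply(lapply(dat, factor), levels)
default_levels$engagement
# "Disagree", "Neither agree nor disagree", "Strongly agree"
default_levels$time
# "Neither agree nor disagree", "Strongly agree", "Strongly disagree"

# Explicit conversion: every column gets the same five ordered levels.
dat[] <- lapply(dat, factor, levels = mylevels)
sapply(dat, nlevels)
```

One further thing that may be worth checking, as an assumption rather than something confirmed here: read_excel returns a tibble, and the error message in the question suggests that a list is reaching as.numeric. Converting with as.data.frame(dat) before calling likert() might therefore be needed in addition to fixing the levels.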