I'm working with Likert data. I pulled four columns out of a data frame with the following:
items <- df[, substr(names(df), 1, 11) == "RTAPOSTPrep"]
The result looks like this:
> items
   RTAPOSTPrep1_PDschool RTAPOSTPrep2_Pddistrict RTAPOSTPrep3_Pdregion RTAPOSTPrep4_PDnational
1    completely prepared     completely prepared   completely prepared     completely prepared
2    completely prepared           very prepared         very prepared           very prepared
3               prepared           very prepared   completely prepared     completely prepared
4          very prepared           very prepared         very prepared           very prepared
5                   <NA>                    <NA>                  <NA>                    <NA>
6    completely prepared     completely prepared   completely prepared     completely prepared
7    completely prepared     completely prepared   completely prepared     completely prepared
8    completely prepared     completely prepared         very prepared           very prepared
9    completely prepared     completely prepared   completely prepared     completely prepared
10         very prepared           very prepared         very prepared           very prepared
11   completely prepared     completely prepared         very prepared           very prepared
12   completely prepared     completely prepared   completely prepared     completely prepared
13   completely prepared           very prepared         very prepared           very prepared
14              prepared                prepared              prepared                prepared
15         very prepared           very prepared         very prepared           very prepared
16         very prepared           very prepared         very prepared           very prepared
17   completely prepared     completely prepared   completely prepared     completely prepared
18   completely prepared     completely prepared         very prepared           very prepared
19                  <NA>                    <NA>                  <NA>                    <NA>
20   completely prepared     completely prepared   completely prepared           very prepared
21         very prepared           very prepared         very prepared                prepared
22                  <NA>                    <NA>                  <NA>                    <NA>
23              prepared                prepared              prepared                prepared
The data appear to be stored as factors:
> str(items)
'data.frame': 23 obs. of 4 variables:
$ RTAPOSTPrep1_PDschool : Factor w/ 3 levels "completely prepared",..: 1 1 2 3 NA 1 1 1 1 3 ...
$ RTAPOSTPrep2_Pddistrict: Factor w/ 3 levels "completely prepared",..: 1 3 3 3 NA 1 1 1 1 3 ...
$ RTAPOSTPrep3_Pdregion : Factor w/ 3 levels "completely prepared",..: 1 3 1 3 NA 1 1 3 1 3 ...
$ RTAPOSTPrep4_PDnational: Factor w/ 3 levels "completely prepared",..: 1 3 1 3 NA 1 1 3 1 3 ...
I want to analyze these data with the "likert" package, but when I do, the levels come out in the wrong order:
> likert(items)
                     Item completely prepared prepared very prepared
1   RTAPOSTPrep1_PDschool                  60       15            25
2 RTAPOSTPrep2_Pddistrict                  50       10            40
3   RTAPOSTPrep3_Pdregion                  40       10            50
4 RTAPOSTPrep4_PDnational                  35       15            50
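I would like five levels in this order: not at all prepared, a little prepared, prepared, very prepared, completely prepared. But whenever I try to manipulate "items" as a whole, I get an error saying the command is only for factors. If I use $ to pull out a single column (e.g. items$RTAPOSTPrep1_PDschool), I can manipulate that factor's levels, but I routinely have to do this for dozens of columns and want a quick way to relevel all of them so that every column has the same five levels in the same order. My best attempt was: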
> apply(items,2,function(x) relevel(x, ref="prepared"))
Error in relevel.default(x, ref = "prepared") :
'relevel' only for factors
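I suspect I don't really understand how factors work or how pulling data out of a data frame works (I'm quite new to R). Can anyone help? I've spent a lot of time trying to do this.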
Answer 0: (score: 1)
First, create a vector that holds the levels in the order you want:
lvl = c("not at all prepared", "a little prepared", "prepared", "very prepared", "completely prepared")
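Below, I create a sample data frame and show that its levels are out of order:
# stringsAsFactors = TRUE is needed in R >= 4.0.0, where strings are no longer converted to factors by default
d <- data.frame(a = sample(lvl, 15, replace = TRUE), b = sample(lvl, 15, replace = TRUE), stringsAsFactors = TRUE)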
levels(d$a)
[1] "a little prepared" "completely prepared" "not at all prepared" "very prepared"
Then use lapply to rebuild each column as a factor with the specified levels, and assign the result back into the original data.frame:
d[] <- lapply(d, function(x) factor(x, levels = lvl))
levels(d$a)
[1] "not at all prepared" "a little prepared" "prepared" "very prepared"
[5] "completely prepared"
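As for why the attempt in the question fails: apply() first converts the data frame to a matrix, and for factor columns that means a character matrix, so relevel() receives plain character vectors. lapply() iterates over the columns as they are stored, so each one is still a factor. A small sketch with a made-up two-column data frame (the names toy, q1 and q2 are illustrative only, not from the question):

```r
# Toy data; the column names are hypothetical, not from the question.
toy <- data.frame(
  q1 = factor(c("prepared", "very prepared", NA)),
  q2 = factor(c("completely prepared", "prepared", "prepared"))
)

# apply() coerces the data frame to a character matrix first,
# so every column arrives as a character vector.
via_apply <- apply(toy, 2, class)

# lapply()/sapply() walk the columns as stored, so they stay factors.
via_sapply <- sapply(toy, class)

print(via_apply)
print(via_sapply)
```

Note also that relevel() only moves one existing level to the front, so it could not add the two levels that never occur in the data; factor(x, levels = lvl) does.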
Answer 1: (score: 0)
Personally, I prefer dplyr over base R for selecting the columns:
library(dplyr)
df %>%
select(contains("RTAPOSTPrep")) # selects all the columns which contain "RTAPOSTPrep"
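If you want to stay in dplyr, the releveling can be done there as well. This is a minimal sketch, assuming dplyr 1.0.0 or later (for across()); the data frame survey and its contents are invented for illustration, and lvl is the level vector in the desired order:

```r
library(dplyr)

lvl <- c("not at all prepared", "a little prepared", "prepared",
         "very prepared", "completely prepared")

# Hypothetical data: two Likert columns plus one unrelated column.
survey <- data.frame(
  id = 1:3,
  RTAPOSTPrep1_PDschool = factor(c("prepared", "very prepared", NA)),
  RTAPOSTPrep2_Pddistrict = factor(c("completely prepared", "prepared", "prepared"))
)

# Rebuild every column whose name starts with the prefix as a factor
# with the full, ordered set of levels; other columns are left alone.
survey <- survey %>%
  mutate(across(starts_with("RTAPOSTPrep"), ~ factor(.x, levels = lvl)))

lapply(survey[-1], levels)
```

Levels that do not occur in the data (here "not at all prepared" and "a little prepared") are still part of each factor afterwards.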
Cookbook for R provides a good introduction.
You can use:
# sample data
var1 <- factor(c("not at all prepared", "prepared"))
var2 <- factor(c("prepared", "very prepared"))
df <- data.frame(var1, var2)
lapply(df, levels)
# $var1
# [1] "not at all prepared" "prepared"
# $var2
# [1] "prepared" "very prepared"
# create vector with correct order
levels <- c("not at all prepared", "a little prepared", "prepared",
"very prepared", "completely prepared")
new_df <- lapply(df, function(x) factor(x, levels = levels)) %>%
as_tibble() # as_data_frame() is deprecated in current tibble/dplyr
lapply(new_df, levels)
# $var1
# [1] "not at all prepared" "a little prepared" "prepared" "very prepared" "completely prepared"
# $var2
# [1] "not at all prepared" "a little prepared" "prepared" "very prepared" "completely prepared"
Update:
If you don't want a new data.frame but would rather modify the existing one in place, pcantalupo's approach works well:
df[] <- lapply(df, function(x) factor(x, levels = levels))
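In the situation from the question, the Likert items are only some of the columns, so the same idea can be restricted to the matching ones. A sketch under the assumption that all items share the "RTAPOSTPrep" prefix; the data frame dat is made up for illustration:

```r
lvl <- c("not at all prepared", "a little prepared", "prepared",
         "very prepared", "completely prepared")

# Hypothetical data: an id column plus two Likert items.
dat <- data.frame(
  id = 1:4,
  RTAPOSTPrep1_PDschool = c("completely prepared", "very prepared", NA, "prepared"),
  RTAPOSTPrep2_Pddistrict = c("very prepared", "very prepared", NA, "prepared"),
  stringsAsFactors = TRUE
)

# Logical index of the item columns (startsWith() needs R >= 3.3.0).
is_item <- startsWith(names(dat), "RTAPOSTPrep")

# Rebuild only those columns with the shared, ordered levels.
dat[is_item] <- lapply(dat[is_item], factor, levels = lvl)

items <- dat[is_item]

# Unused levels are kept, so the counts include zeros for them.
table(items$RTAPOSTPrep1_PDschool)
```

The resulting items data frame has identical levels in every column and can then be passed to likert() as in the question.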