R - Reordering factor levels for the likert package

Date: 2015-09-16 15:13:25

Tags: r

I am working with Likert data. I pulled four columns out of a data frame with the following:

  

items <- df[, substr(names(df), 1, 11) == "RTAPOSTPrep"]

The result looks like this:

> items
   RTAPOSTPrep1_PDschool RTAPOSTPrep2_Pddistrict RTAPOSTPrep3_Pdregion RTAPOSTPrep4_PDnational
1    completely prepared     completely prepared   completely prepared     completely prepared
2    completely prepared           very prepared         very prepared           very prepared
3               prepared           very prepared   completely prepared     completely prepared
4          very prepared           very prepared         very prepared           very prepared
5                   <NA>                    <NA>                  <NA>                    <NA>
6    completely prepared     completely prepared   completely prepared     completely prepared
7    completely prepared     completely prepared   completely prepared     completely prepared
8    completely prepared     completely prepared         very prepared           very prepared
9    completely prepared     completely prepared   completely prepared     completely prepared
10         very prepared           very prepared         very prepared           very prepared
11   completely prepared     completely prepared         very prepared           very prepared
12   completely prepared     completely prepared   completely prepared     completely prepared
13   completely prepared           very prepared         very prepared           very prepared
14              prepared                prepared              prepared                prepared
15         very prepared           very prepared         very prepared           very prepared
16         very prepared           very prepared         very prepared           very prepared
17   completely prepared     completely prepared   completely prepared     completely prepared
18   completely prepared     completely prepared         very prepared           very prepared
19                  <NA>                    <NA>                  <NA>                    <NA>
20   completely prepared     completely prepared   completely prepared           very prepared
21         very prepared           very prepared         very prepared                prepared
22                  <NA>                    <NA>                  <NA>                    <NA>
23              prepared                prepared              prepared                prepared

The data appear to be stored as factors:

> str(items)
'data.frame':   23 obs. of  4 variables:
 $ RTAPOSTPrep1_PDschool  : Factor w/ 3 levels "completely prepared",..: 1 1 2 3 NA 1 1 1 1 3 ...
 $ RTAPOSTPrep2_Pddistrict: Factor w/ 3 levels "completely prepared",..: 1 3 3 3 NA 1 1 1 1 3 ...
 $ RTAPOSTPrep3_Pdregion  : Factor w/ 3 levels "completely prepared",..: 1 3 1 3 NA 1 1 3 1 3 ...
 $ RTAPOSTPrep4_PDnational: Factor w/ 3 levels "completely prepared",..: 1 3 1 3 NA 1 1 3 1 3 ...

I want to analyze these data with the "likert" package, but when I do, the levels come out in the wrong order:

> likert(items)
                     Item completely prepared prepared very prepared
1   RTAPOSTPrep1_PDschool                  60       15            25
2 RTAPOSTPrep2_Pddistrict                  50       10            40
3   RTAPOSTPrep3_Pdregion                  40       10            50
4 RTAPOSTPrep4_PDnational                  35       15            50

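I want five levels in this order: not at all prepared, a little prepared, prepared, very prepared, completely prepared. But whenever I try to manipulate "items" as a whole, I get an error saying the command only works on factors. If I pull out a single column with $ (i.e. items$RTAPOSTPrep1_PDschool), I can manipulate that factor's levels, but I usually have to do this for dozens of columns, and I would like a quick way to relevel all of them so that every column has the same five levels in the same order. My best attempt was: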

> apply(items,2,function(x) relevel(x, ref="prepared"))
Error in relevel.default(x, ref = "prepared") : 
  'relevel' only for factors

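I suspect I don't fully understand how factors work or how extracting data from a data frame works (I am very new to R). Can anyone help? I have spent a lot of time trying to do this.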

2 answers:

Answer 0: (score: 1)

First, create a vector that holds the levels in the order you want:

lvl = c("not at all prepared", "a little prepared", "prepared", "very prepared", "completely prepared")

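Below, I create a sample data frame and show that the levels are out of order:

d <- data.frame(a = sample(lvl, 15, replace = TRUE),
                b = sample(lvl, 15, replace = TRUE),
                stringsAsFactors = TRUE)  # needed in R >= 4.0.0, where strings are no longer converted to factors by default
levels(d$a)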

[1] "a little prepared"   "completely prepared" "not at all prepared" "very prepared"   

Then use lapply to rebuild each column as a factor with the specified levels, and assign the result back to the original data.frame:

d[] <- lapply(d, function(x) factor(x, levels = lvl))
levels(d$a)

[1] "not at all prepared" "a little prepared"   "prepared"            "very prepared"      
[5] "completely prepared"
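As for why the apply() attempt in the question fails: apply() first coerces the data frame to a matrix, and a data frame of factors becomes a character matrix, so relevel() never sees a factor. The sketch below illustrates this with a small made-up data frame (the column names q1 and q2 are placeholders, not from the question):

```r
# Toy stand-in for the question's `items`; q1 and q2 are hypothetical names
items <- data.frame(
  q1 = factor(c("prepared", "very prepared", NA)),
  q2 = factor(c("completely prepared", "prepared", "very prepared"))
)

# apply() converts the data frame to a character matrix first,
# so the function receives plain character vectors
classes_apply <- apply(items, 2, class)
print(classes_apply)

# lapply()/vapply() walk the data frame as a list of columns,
# so each column is still a factor when the function sees it
classes_lapply <- vapply(items, class, character(1))
print(classes_lapply)
```

This is why the lapply version above works where apply does not: a data frame is a list of columns, and lapply hands each column over unchanged.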

Answer 1: (score: 0)

Extracting the data

I personally prefer dplyr over base R:

library(dplyr)
df %>% 
  select(contains("RTAPOSTPrep")) # selects all the columns which contain "RTAPOSTPrep"
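For comparison, if dplyr is not available, roughly the same selection can be sketched in base R with startsWith(). The data frame below is a made-up stand-in for the question's df; only the two RTAPOSTPrep column names come from the question, and the id and other columns are invented for illustration.

```r
# Hypothetical data frame standing in for the question's `df`
df <- data.frame(
  id = 1:3,
  RTAPOSTPrep1_PDschool = c("prepared", "very prepared", NA),
  RTAPOSTPrep2_Pddistrict = c("prepared", "prepared", "very prepared"),
  other = c("x", "y", "z")
)

# Keep only the columns whose names start with the prefix.
# drop = FALSE keeps the result a data frame even if a single column matches.
items <- df[, startsWith(names(df), "RTAPOSTPrep"), drop = FALSE]
names(items)
```

Note that startsWith() matches only at the beginning of the name, whereas contains() matches anywhere in it, so the two are equivalent only when the prefix never appears mid-name.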

Releveling factors

Cookbook for R provides a good introduction.

You can use:

# sample data
var1 <- factor(c("not at all prepared", "prepared"))
var2 <- factor(c("prepared", "very prepared"))
df <- data.frame(var1, var2)
lapply(df, levels)
# $var1
# [1] "not at all prepared" "prepared"           

# $var2
# [1] "prepared"      "very prepared"


# create vector with correct order
levels <- c("not at all prepared", "a little prepared", "prepared",
            "very prepared", "completely prepared")

new_df <- lapply(df, function(x) factor(x, levels = levels)) %>% 
  as_tibble  # as_data_frame is deprecated in current tibble/dplyr

lapply(new_df, levels)
# $var1
# [1] "not at all prepared" "a little prepared"   "prepared"            "very prepared"       "completely prepared"

# $var2
# [1] "not at all prepared" "a little prepared"   "prepared"            "very prepared"       "completely prepared"
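One thing to watch for with factor(x, levels = ...): any value that does not exactly match one of the supplied levels silently becomes NA. A minimal sketch, using an invented misspelling ("Very prepared" with a capital V) to show how to check for unmatched labels before converting:

```r
lvl <- c("not at all prepared", "a little prepared", "prepared",
         "very prepared", "completely prepared")

# Made-up responses; the second one has a capital V and will not match
x <- c("prepared", "Very prepared", "completely prepared")

# Values missing from `lvl` are turned into NA without a warning
f <- factor(x, levels = lvl)
print(f)

# List the labels that would be lost, ignoring genuine missing values
unmatched <- setdiff(unique(x[!is.na(x)]), lvl)
print(unmatched)
```

If unmatched is non-empty, it is probably worth cleaning up the labels first, since the lost values are otherwise indistinguishable from real missing responses.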

Update: If you don't want a new data.frame but would rather modify the existing one in place, pcantalupo's approach works well:

df[] <- lapply(df, function(x) factor(x, levels = levels))
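If the data frame also holds columns that are not Likert items, the same idea can be restricted to the factor columns. This is a sketch under that assumption; the id, q1 and q2 columns are invented for illustration, and the vector of levels is called lvl here to keep the example self-contained.

```r
lvl <- c("not at all prepared", "a little prepared", "prepared",
         "very prepared", "completely prepared")

# Hypothetical mixed data frame: one id column plus two Likert items
df <- data.frame(
  id = 1:3,
  q1 = factor(c("prepared", "very prepared", NA)),
  q2 = factor(c("completely prepared", "prepared", "prepared"))
)

# Logical index of the factor columns
is_fac <- vapply(df, is.factor, logical(1))

# Rebuild only those columns with the full, ordered set of levels
df[is_fac] <- lapply(df[is_fac], factor, levels = lvl)

lapply(df[is_fac], levels)
```

The non-factor columns are left untouched, and levels that never occur in the data are kept as empty levels rather than dropped.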