Problem generating factors in R

Time: 2017-03-12 18:37:58

Tags: r statistics statsmodels

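After converting my data from wide to long format with melt(), I am trying to turn a categorical variable into a factor in R. However, when I call the factor function with levels and labels, every value in the result is <NA>:

Does anyone know why this happens?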

law <- read.csv("lawyers_class_new.csv")


library(reshape2)
law <- melt(law, id.vars = c("Subj"), measure.vars = c("lawyer", "neutral", "engineer", "neutral_urb", "neutral_rur"))
law <- law[order(law$Subj),]
law <- within(law,
              Subj <- factor(Subj),
              variable <- factor(variable)
              )
law$variable<- ordered(law$variable,levels=c(1,2,3,4,5),labels=c("lawyer","neutral",
    "engineer","neutral_urb","neutral_rur"))


Output: 

law$variable
  [1] <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
 [18] <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
 [35] <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
 [52] <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
 [69] <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
 [86] <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
[103] <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
[120] <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
[137] <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>

MELTED DATA FRAME:

Subj  Cond    variable    value
1         2       lawyer      3
1         3      neutral      1
1         1      engineer     3.5
1         5      neutral_urb  3
1         4      neutral_rur  3.5
2         2      lawyer       1
2         3      neutral      3.5
2         1      engineer     4.5
2         5      neutral_urb  2
2         4      neutral_rur  3.5

ORIGINAL DATA FRAME:

Subj    lawyer  neutral engineer    neutral_urb neutral_rur
1          3       1      3.5           3          3.5
2          1     3.5      4.5           2          3.5

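1 answer:

Answer 0 (score: 0)

To minimize errors, I would not import character columns as factors, and it seems that the within() call does not create the correct factor for law$variable. I would therefore specify the factors explicitly, as follows, to guarantee the correct level order.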

law  <- read.table(text="Subj  Cond    variable    value
1         2       lawyer      3
1         3      neutral      1
1         1      engineer     3.5
1         5      neutral_urb  3
1         4      neutral_rur  3.5
2         2      lawyer       1
2         3      neutral      3.5
2         1      engineer     4.5
2         5      neutral_urb  2
2         4      neutral_rur  3.5", header=TRUE, stringsAsFactors=FALSE)

law <- law[order(law$Subj),]

law$Subj <- as.factor(law$Subj)
law$variable <- factor(law$variable, levels = c("lawyer", "neutral",
    "engineer", "neutral_urb", "neutral_rur"))

str(law)
'data.frame':   10 obs. of  4 variables:
 $ Subj    : Factor w/ 2 levels "1","2": 1 1 1 1 1 2 2 2 2 2
 $ Cond    : int  2 3 1 5 4 2 3 1 5 4
 $ variable: Factor w/ 5 levels "lawyer","neutral",..: 1 2 3 4 5 1 2 3 4 5
 $ value   : num  3 1 3.5 3 3.5 1 3.5 4.5 2 3.5
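
As for why the original call returned only <NA>: factor() and ordered() match the values of the input against `levels`, and anything that matches no level becomes NA. After melt(), `variable` contains the column names ("lawyer", "neutral", ...), not the codes 1 to 5, so nothing matches. Below is a minimal sketch of this on toy data. The names `law_toy`, `cond_names` and `bad` are made up for illustration, and the values are copied from the tables in the question. It also wraps the assignments in braces, since as far as I can tell within() evaluates only a single expression and quietly ignores a second one passed after a comma.

```r
# Toy stand-in for the melted data: 'variable' holds condition names, not numeric codes.
cond_names <- c("lawyer", "neutral", "engineer", "neutral_urb", "neutral_rur")
law_toy <- data.frame(
  Subj = rep(1:2, each = 5),
  variable = rep(cond_names, times = 2),
  value = c(3, 1, 3.5, 3, 3.5, 1, 3.5, 4.5, 2, 3.5),
  stringsAsFactors = FALSE
)

# The call from the question: no value equals 1..5, so every element becomes NA.
bad <- ordered(law_toy$variable, levels = c(1, 2, 3, 4, 5), labels = cond_names)
print(all(is.na(bad)))

# Using the names themselves as levels keeps the data and fixes the order.
# Several assignments inside within() go in one braced expression.
law_toy <- within(law_toy, {
  Subj <- factor(Subj)
  variable <- ordered(variable, levels = cond_names)
})
print(levels(law_toy$variable))
```

`labels` is only needed when the displayed names should differ from the stored values. Here they are identical, so `levels` alone is enough.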