Converting a factor to numeric after relabelling

Asked: 2014-10-29 17:14:00

Tags: r data-structures integer coercion

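This question is related to "Convert factor to integer" and "How to convert a factor to an integer\numeric without a loss of information", but the type-coercion problem here is slightly different.

Both of those earlier questions seem to concern the case where a factor is built explicitly from a pre-existing vector of class numeric or integer, without relabelling the levels. In those cases: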

f <- factor(c("1","2","1","2"))
as.numeric(levels(f))[f]

returns

# [1] 1 2 1 2

But when I relabel the levels:

f <- factor(c("1","2","1","2"))
f <- factor(f,
            levels = c(1, 2),
            labels = c("a", "b"))
as.numeric(levels(f))[f]

I get

# [1] NA NA NA NA
# Warning message:
# NAs introduced by coercion

whereas

as.numeric(f)

returns

# [1] 1 2 1 2

What is the correct procedure for recovering the original values in this case? Is it simply as.numeric(f)?

In case it is relevant:

> sessionInfo()
R version 3.1.2 RC (2014-10-28 r66890)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_IE.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_IE.UTF-8        LC_COLLATE=en_IE.UTF-8
 [5] LC_MONETARY=en_IE.UTF-8    LC_MESSAGES=en_IE.UTF-8
 [7] LC_PAPER=en_IE.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_IE.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base
loaded via a namespace (and not attached):
[1] tools_3.1.2

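1 answer:

Answer 0 (score: 0)

If you are sure there is an exact correspondence between the original levels and the underlying factor/integer codes, you can use as.numeric(f). But if the original vector were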

 f <- factor(c("2","3","2","3"))

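and you changed the level labels to alphabetic values, then as.numeric(f) would give a misleading result: it returns the internal integer codes, not the original values. Factor codes always start at 1L.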
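
A minimal sketch of that pitfall, and one possible way around it, assuming you still have the original numeric levels available at the point of relabelling. The names orig_levels, lookup, and recovered are hypothetical and do not come from the question:

```r
# The original values are "2" and "3". After relabelling, the factor only
# stores the integer codes 1 and 2 plus the new labels "a" and "b".
orig_levels <- c(2, 3)                      # assumption: saved before relabelling
f <- factor(c("2", "3", "2", "3"))
f <- factor(f, levels = orig_levels, labels = c("a", "b"))

# as.numeric() returns the internal codes, which start at 1,
# so the original values 2 and 3 are not recovered this way.
codes <- as.numeric(f)
print(codes)

# Option 1: index the saved original levels by the integer codes.
recovered <- orig_levels[as.integer(f)]
print(recovered)

# Option 2: an explicit label-to-value lookup, indexed by the labels.
lookup <- c(a = 2, b = 3)
recovered_by_label <- unname(lookup[as.character(f)])
print(recovered_by_label)
```

Either option relies on a mapping kept outside the factor, because a relabelled factor no longer carries the original values itself.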