Question

f1 <- c("a", "b", "c")

f2 <- c("x", "e", "t")

f1 <- factor(f1)

f1
#[1] a b c
#Levels: a b c


str(f1)
#Factor w/ 3 levels "a","b","c": 1 2 3

f2 <- factor(f2)

f2
#[1] x e t
#Levels: e t x

str(f2)
#Factor w/ 3 levels "e","t","x": 3 1 2

As shown above, why is "e" in f2 treated as 3? Shouldn't it be 1 when considered in alphabetical order?

Answer 1

You set f2 to c("x", "e", "t"), so "x", which has code 3 (from alphabetical order), is still in the first position, while "e", in the second position, does indeed have code 1.

    f2 <- factor(c("x", "e", "t"))
    str(f2)
    # Factor w/ 3 levels "e","t","x": 3 1 2

Explanation of the str(f2) result:

f2 is of type factor, which means the values are not stored as-is but are encoded as integer codes.
f2 has 3 levels (3 distinct values), in the order "e", "t", "x", so "e" is encoded as 1, "t" as 2, and "x" as 3.
f2 contains 3 encoded values: 3, 1, 2.
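
The two parts of the str() output can also be inspected separately. A minimal sketch using base R's levels() and as.integer():

```r
f2 <- factor(c("x", "e", "t"))

# The levels: the distinct values, in sorted order
levels(f2)
# [1] "e" "t" "x"

# The integer codes: one per element, each an index into levels(f2)
as.integer(f2)
# [1] 3 1 2
```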

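To decode the factor:

Take the first encoded value (3) and replace it with its level ("x" = code 3),
then the second encoded value (1) and replace it with its level ("e" = code 1),

...

and finally the last encoded value (2) and replace it with its level ("t" = code 2).

=> You get "x", "e", "t".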
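
The same decoding can be done in one step by indexing the levels with the integer codes. A small sketch (the variable name decoded is only for illustration):

```r
f2 <- factor(c("x", "e", "t"))

# Look up each integer code in the levels to recover the original values
decoded <- levels(f2)[as.integer(f2)]
decoded
# [1] "x" "e" "t"

# as.character() gives the same result
identical(decoded, as.character(f2))
# [1] TRUE
```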

Let's add an extra value ("e" again) at the end of f2:

    f2[4] <-  "e"
    str(f2)
    # Factor w/ 3 levels "e","t","x": 3 1 2 1

You can see that a code 1, encoding "e", is now in the 4th position.

f2 now represents: "x", "e", "t", "e".

Answer 2

str(f2) shows the letters in alphabetical order, but the numbers depend on the positions the letters occupy in the f2 object.

If f2 is x e t:

 Levels are e t x (in order)

 Numbers for the above letters would be: (in order)

 e = 1
 t = 2
 x = 3

 str gives the number sequence according to the positions the letters occupy in
 the original f2 object, i.e. x, e, t = 3, 1, 2
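
The mapping described above can be reproduced by hand. This is only a sketch of the default behavior, assuming no levels argument is passed to factor(); the names x, lv and codes are illustrative, and the sort order of strings can depend on the locale:

```r
x <- c("x", "e", "t")

# Assumption: the default levels are the sorted distinct values
lv <- sort(unique(x))
lv
# [1] "e" "t" "x"

# Each element's code is its position among the levels
codes <- match(x, lv)
codes
# [1] 3 1 2

# Compare with what factor() actually builds
f2 <- factor(x)
identical(levels(f2), lv)
identical(as.integer(f2), codes)
```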

Hope this helps.

How does a factor in R automatically order its levels?
