Computing factors across columns in R

Time: 2017-08-22 23:38:23

Tags: r dictionary dataframe type-conversion xts

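I need to convert a series of character columns to factors. I then need the factors to map to the same enumerated values across columns when they are converted to numeric.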

as.numeric(as.factor(characterColumnDataFrame))

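This currently returns factors that are computed independently for each column, so the resulting numbers do not correspond to the same strings across columns.

I would like to avoid converting one column and then looking up and mapping the enumeration from that first column onto the others.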
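
To make the mismatch concrete, here is a minimal sketch using a hypothetical two-column data frame (the names df, x and y are illustrative and not from the question). Each column is converted on its own, so the same string can receive different codes in different columns.

```r
# Hypothetical data: "b" and "c" appear in both columns.
df <- data.frame(x = c("a", "b", "c"),
                 y = c("b", "c", "d"),
                 stringsAsFactors = FALSE)

# Each column is converted to a factor independently,
# so each column gets its own set of levels.
per_column <- lapply(df, function(col) as.numeric(factor(col)))

per_column$x  # 1 2 3 for "a" "b" "c"
per_column$y  # 1 2 3 for "b" "c" "d"
```

Here "b" is coded 2 in x but 1 in y, which is the inconsistency described above.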

3 answers:

Answer 0: (score: 2)

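Use levels= when creating the factors. Here DF contains the character columns, and DF2 has factor columns that all share the same levels, levs. Because levs is only used once, the conversion could also be written as a single line.

The full version, with a test data frame: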

# test data frame
DF <- as.data.frame(matrix(letters,, 2), stringsAsFactors = FALSE) 

DF2 <- DF
levs <- sort(unique(unlist(DF)))
DF2[] <- lapply(DF2, factor, levels = levs)
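
The single-line form mentioned above does not appear in this copy of the answer. A possible version, offered as a sketch rather than the author's exact code, inlines the shared level set and uses base R's replace() to fill every column; DF2_num is an illustrative name for the numeric result.

```r
# Same test data frame as above: V1 holds "a".."m", V2 holds "n".."z".
DF <- as.data.frame(matrix(letters, ncol = 2), stringsAsFactors = FALSE)

# One-line variant: compute the shared levels inline and apply them to all columns.
DF2 <- replace(DF, TRUE, lapply(DF, factor, levels = sort(unique(unlist(DF)))))

# Convert to integer codes; every column now refers to the same level set.
DF2_num <- DF2
DF2_num[] <- lapply(DF2, as.integer)
```

With the shared levels in place, a given string receives the same code regardless of the column it appears in.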

Answer 1: (score: 1)

The fct_unify() function from Hadley Wickham's forcats package unifies the levels in a list of factors.

# using G. Grothendieck's test data frame
DF <- as.data.frame(matrix(letters,, 2), stringsAsFactors = FALSE)
str(DF)
# 'data.frame': 13 obs. of  2 variables:
#  $ V1: chr  "a" "b" "c" "d" ...
#  $ V2: chr  "n" "o" "p" "q" ...
DF[] <- lapply(DF, factor)
str(DF)
# 'data.frame': 13 obs. of  2 variables:
#  $ V1: Factor w/ 13 levels "a","b","c","d",..: 1 2 3 4 5 6 7 8 9 10 ...
#  $ V2: Factor w/ 13 levels "n","o","p","q",..: 1 2 3 4 5 6 7 8 9 10 ...
DF[] <- forcats::fct_unify(DF)
str(DF)
# 'data.frame': 13 obs. of  2 variables:
#  $ V1: Factor w/ 26 levels "a","b","c","d",..: 1 2 3 4 5 6 7 8 9 10 ...
#  $ V2: Factor w/ 26 levels "a","b","c","d",..: 14 15 16 17 18 19 20 21 22 23 ...

Or, as a one-liner that returns the numeric codes of the unified factor levels:

DF[] <- lapply(forcats::fct_unify(lapply(DF, factor)), as.numeric)
DF
#    V1 V2
# 1   1 14
# 2   2 15
# 3   3 16
# 4   4 17
# 5   5 18
# 6   6 19
# 7   7 20
# 8   8 21
# 9   9 22
# 10 10 23
# 11 11 24
# 12 12 25
# 13 13 26

Answer 2: (score: 0)

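library(zoo)
# placeholder name: an xts object with many character columns
test = xtsCharacterObjectWithManyColumns
# intended to encode every cell against one shared set of levels
xts::coredata(test) = as.numeric(factor(test, levels = unique(test), ordered = TRUE))
# intended to store the core data as numeric rather than character
base::storage.mode(test) = "numeric"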
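
The same idea can be sketched without xts by substituting a plain character matrix for the object's core data. This is an illustration under that assumption, not a tested xts recipe; m, levs and codes are hypothetical names.

```r
# Hypothetical character matrix standing in for the core data of an xts object.
m <- matrix(c("up", "down", "flat", "down", "up", "up"), ncol = 2,
            dimnames = list(NULL, c("a", "b")))

# Shared levels, in order of first appearance across the whole matrix.
levs <- unique(as.vector(m))

# factor() returns a plain vector, so the matrix shape is restored afterwards.
codes <- matrix(as.numeric(factor(m, levels = levs)),
                nrow = nrow(m), dimnames = dimnames(m))

codes[, "a"]  # 1 2 3 for "up" "down" "flat"
codes[, "b"]  # 2 1 1 for "down" "up" "up"
```

Taking the levels from unique() rather than sort(unique()) numbers the strings by first appearance instead of alphabetically; either choice keeps the codes consistent across columns.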