Converting factors to numeric in a data frame

Time: 2014-07-23 10:09:22

Tags: r

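For some functions I want to apply to my data, I need numeric values in the data frame. Right now they are stored as factors.

Is there a simple way to convert the whole data frame to numeric?

Part of the dput output: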
"0.966968221", "0.971526427", "0.975908363", "0.976354638", 
    "0.983503732", "0.984850291", "0.985224666", "0.987182132", 
    "0.987468192", "0.988309086", "0.994685984", "0.996238630", 
    "0.997917853", "0.998762891", "0.999968143", "1.000000000"
    ), class = "factor")), .Names = c("10", "33.95", "58.66", 
"84.42", "110.21", "134.16", "164.69", "199.1", "234.35", "257.19", 
"361.84", "432.74", "506.34", "581.46", "651.71", "732.59", "817.56", 
"896.24", "971.77", "1038.91"), row.names = c("at1g01050.1", 
"at1g01080.1", "at1g01090.1", "at1g01320.2", "at1g01470.1", "at1g01800.1"
), class = "data.frame")

Class of the values in the data.frame:

> class(tbl_alles[103,5])
[1] "factor"
> class(tbl_alles[553,12])
[1] "factor"

What I have tried so far:

First attempt:

tbl_alles <- sapply(tbl_alles, as.numeric) ## Changing the values in the data frame

Second attempt:

> as.numeric(as.character(tbl_alles))
 [1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
Warning message:
NAs introduced by coercion 

Third attempt:

 > as.numeric(levels(tbl_alles))[tbl_alles]
Error in as.numeric(levels(tbl_alles))[tbl_alles] : 
  invalid subscript type 'list'

Any solutions?

1 Answer:

Answer 0: (score: 4)

One way to do this:

tbl_alles[sapply(tbl_alles, is.factor)] <- lapply(tbl_alles[sapply(tbl_alles, is.factor)], function(x) as.numeric(as.character(x)))

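This finds the columns of type factor and converts them to class numeric. Going through as.character matters: calling as.numeric directly on a factor returns its internal integer level codes rather than the numbers shown as labels.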
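To illustrate why the conversion goes through as.character, here is a minimal sketch on a toy data frame. The data frame `df`, its column names, and its values are made up for illustration and are not taken from tbl_alles.

```r
# Hypothetical toy data (not the asker's tbl_alles): two factor columns
# holding numbers as labels, plus one plain character column.
df <- data.frame(
  a  = factor(c("33.95", "10", "58.66")),
  b  = factor(c("0.97", "0.98", "1")),
  id = c("x", "y", "z"),
  stringsAsFactors = FALSE
)

# as.numeric() on a factor gives the level codes, not the labels.
# The levels of df$a sort as "10", "33.95", "58.66", so the codes are 2, 1, 3.
codes <- as.numeric(df$a)

# Select only the factor columns and convert them via character.
is_fac <- sapply(df, is.factor)
df[is_fac] <- lapply(df[is_fac], function(x) as.numeric(as.character(x)))

str(df)  # a and b are now numeric; id is left untouched
```

Labels that cannot be parsed as numbers become NA with a coercion warning, so it is worth checking for unexpected NA values afterwards.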

Another option (possibly a bit faster) is to use data.table:

library(data.table)
setDT(tbl_alles)[, names(tbl_alles) := lapply(.SD, function(x) if(is.factor(x)) as.numeric(as.character(x)) else x)]

If your entire data set is of type factor and you want to convert all columns to numeric, you can do

tbl_alles[] <- lapply(tbl_alles, function(x) as.numeric(as.character(x)))
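As a rough sketch of what the empty brackets buy you: assigning into `df[]` keeps the object a data frame with its row names, whereas `sapply(df, ...)` would return a matrix. The toy data frame below is hypothetical. The second variant uses `as.numeric(levels(x))[x]`, the per-column form of the question's third attempt, which fails there only because it was applied to the whole data frame (a list) rather than to a single factor.

```r
# Hypothetical all-factor data frame with row names, for illustration only.
df <- data.frame(
  x = factor(c("0.5", "1", "2")),
  y = factor(c("7", "8", "9")),
  row.names = c("g1", "g2", "g3")
)
df2 <- df  # copy for the second variant

# Variant 1: convert every column via character; df[] preserves the
# data.frame class, column names and row names.
df[] <- lapply(df, function(x) as.numeric(as.character(x)))

# Variant 2: convert the levels once, then index them by the factor.
df2[] <- lapply(df2, function(x) as.numeric(levels(x))[x])

str(df)
```

Both variants should give the same result. The second converts each distinct level only once, which may help when columns are long and have few distinct values.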