I wrote the following function to add "-1" as a level to the factors in my data frame and then set the NA values to "-1":
fun <- function(df) {
  add_na_level <- function(x){
    if(is.factor(x) & !"-1" %in% levels(x)) return(factor(x, levels=c(levels(x), "-1")))
    x[is.na(x)]<-"-1"
    return(x)
  }
  df<-sapply(df,add_na_level)
  return(df)
}
However, it runs very slowly when I use it on my data frame. Is this the line that is expensive?
df<-sapply(df,add_na_level)
Answer 0 (score: 0)
You can try:
# The function
foo <- function(x){
  x <- as.numeric(as.character(x))
  x[is.na(x)] <- -1
  as.factor(x)
}
# Run on numeric input vector
foo(c(1:4, NA))
[1] 1 2 3 4 -1
Levels: -1 1 2 3 4
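Note that foo round-trips through as.numeric(as.character(x)), so it assumes the values look like numbers. For a factor with levels such as "a" and "b", that conversion would turn every value into NA (with a warning). A minimal sketch of a variant that works on any factor is shown below; the name add_na_level is borrowed from the question and the na_level argument is a hypothetical addition, neither is part of the original answer.

```r
# Hypothetical helper (not from the original answer): works for any factor,
# including ones whose levels are not numbers.
add_na_level <- function(x, na_level = "-1") {
  if (!is.factor(x)) return(x)            # leave non-factor columns untouched
  if (!na_level %in% levels(x)) {
    levels(x) <- c(levels(x), na_level)   # append the new level first
  }
  x[is.na(x)] <- na_level                 # the assignment is now valid
  x
}

# Made-up example input
f <- factor(c("a", "b", NA, "a"))
add_na_level(f)
```

Appending the level before assigning matters: assigning a value that is not already a level of the factor would produce NA again, with a warning.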
Converting a data.frame:
set.seed(2134)
df <- data.frame(matrix(sample(c(NA, 1:9), 25, TRUE), nrow = 5))
str(df)
'data.frame': 5 obs. of 5 variables:
$ X1: int 7 5 4 5 2
$ X2: int 4 3 7 2 2
$ X3: int 9 8 4 4 4
$ X4: int 8 7 5 6 9
$ X5: int 8 7 7 4 NA
df[] <- lapply(df, foo)
str(df)
'data.frame': 5 obs. of 5 variables:
$ X1: Factor w/ 4 levels "2","4","5","7": 4 3 2 3 1
$ X2: Factor w/ 4 levels "2","3","4","7": 3 2 4 1 1
$ X3: Factor w/ 3 levels "4","8","9": 3 2 1 1 1
$ X4: Factor w/ 5 levels "5","6","7","8",..: 4 3 1 2 5
$ X5: Factor w/ 4 levels "-1","4","7","8": 4 3 3 2 1
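The answer uses df[] <- lapply(df, foo) where the question used sapply. Whatever the cause of the slowdown, the two calls do not return the same kind of object: sapply simplifies equal-length results into a matrix, while assigning the lapply result into df[] keeps the data.frame and its factor columns. A small sketch of the difference, using made-up data and a copy of foo from above:

```r
# foo copied from the answer above
foo <- function(x) {
  x <- as.numeric(as.character(x))
  x[is.na(x)] <- -1
  as.factor(x)
}

# Made-up data, only for illustration
df <- data.frame(a = c(1, NA, 3), b = c(NA, 5, 5))

m <- sapply(df, foo)       # results are simplified into a matrix
is.matrix(m)

out <- df
out[] <- lapply(df, foo)   # column-wise replacement keeps the data.frame
is.data.frame(out)
sapply(out, is.factor)     # every column is still a factor
```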