How to convert a bunch of columns to numeric in R

Asked: 2015-01-25 00:07:12

Tags: r

Here is my df:

    structure(list(Time = structure(c(3L, 4L, 5L, 6L, 1L, 2L), .Label = c("1/20/15 10:26 AM", 
"1/20/15 11:26 AM", "1/20/15 6:26 AM", "1/20/15 7:26 AM", "1/20/15 8:26 AM", 
"1/20/15 9:26 AM"), class = "factor"), Server1 = structure(c(1L, 
4L, 5L, 2L, 3L, 6L), .Label = c("1.08", "12.08", "15", "4", "7.92", 
"No data"), class = "factor"), Server2 = structure(c(1L, 2L, 
4L, 4L, 3L, 4L), .Label = c("1.67", "4.33", "7.75", "No data"
), class = "factor"), Server3 = structure(c(1L, 2L, 3L, 5L, 4L, 
6L), .Label = c("0.83", "2.33", "3.58", "3.92", "4", "No data"
), class = "factor")), .Names = c("Time", "Server1", "Server2", 
"Server3"), row.names = c(NA, -6L), class = "data.frame")

I need to be able to convert all of the cells to numeric. When I do

data$Server1<-as.numeric(data$Server1)

I get this error:

Error in `$<-.data.frame`(`*tmp*`, "Server", value = numeric(0)) : 
  replacement has 0 rows, data has 6

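Also, I would like to convert the columns to numeric without referencing data$Server1 or data$Server2 individually, since I may have a few hundred columns.

Is there a better way to convert all columns to numeric and replace the non-numeric cells with NA?

4 answers:

Answer 0: (score: 5)

You can use lapply() to apply a function over the columns of interest. I assume you want to keep the Time column intact, so we can leave that column out with [-1] indexing.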

## change all 'No data' elements to NA
is.na(df) <- df == "No data"
## for columns 2:4, drop extra factor levels and convert to numeric
df[-1] <- lapply(droplevels(df)[-1], function(x) as.numeric(levels(x))[x])

which gives

df
              Time Server1 Server2 Server3
1  1/20/15 6:26 AM    1.08    1.67    0.83
2  1/20/15 7:26 AM    4.00    4.33    2.33
3  1/20/15 8:26 AM    7.92      NA    3.58
4  1/20/15 9:26 AM   12.08      NA    4.00
5 1/20/15 10:26 AM   15.00    7.75    3.92
6 1/20/15 11:26 AM      NA      NA      NA
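The as.numeric(levels(x))[x] idiom matters because calling as.numeric() directly on a factor returns its internal integer codes rather than the values shown in the labels. A minimal sketch of the difference, using a small made-up factor `f` (not part of the original data):

```r
# Hypothetical factor for illustration only
f <- factor(c("1.08", "4", "12.08"))

codes  <- as.numeric(f)             # integer codes (positions in levels(f)), not the values
values <- as.numeric(levels(f))[f]  # convert the levels once, then index by the factor
same   <- as.numeric(as.character(f))  # equivalent result, converts every element

print(codes)
print(values)
```

Converting the levels and then indexing is usually a little cheaper than as.character() on long columns, because only the distinct levels are parsed.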

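You can also avoid this problem altogether at read time by passing the na.strings argument in the read call, so there is no need to fix the columns after reading.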

read.table(file, na.strings = "No data")
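As a rough, self-contained illustration of the na.strings approach, the sketch below reads a small inline CSV instead of a real file. The `csv_text` contents and the `df_read` name are made up for the example, and the comma-separated layout is an assumption about how the real file looks:

```r
# Hypothetical inline data standing in for the real file
csv_text <- "Time,Server1,Server2
1/20/15 6:26 AM,1.08,1.67
1/20/15 7:26 AM,4,No data
1/20/15 8:26 AM,No data,4.33"

# "No data" is treated as NA while parsing, so the Server columns
# are read as numeric and never become factors or strings
df_read <- read.csv(text = csv_text, na.strings = "No data")
str(df_read)
```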

Answer 1: (score: 3)

Using dplyr

library(dplyr)
df %>% mutate_each(funs(as.numeric(levels(.))[.]), -Time)

You get:

#              Time Server1 Server2 Server3
#1  1/20/15 6:26 AM    1.08    1.67    0.83
#2  1/20/15 7:26 AM    4.00    4.33    2.33
#3  1/20/15 8:26 AM    7.92      NA    3.58
#4  1/20/15 9:26 AM   12.08      NA    4.00
#5 1/20/15 10:26 AM   15.00    7.75    3.92
#6 1/20/15 11:26 AM      NA      NA      NA
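Note that mutate_each() and funs() have since been deprecated in dplyr. A sketch of the same idea with across(), assuming dplyr 1.0 or later; the small `df` here is a made-up stand-in for the question's data, and `df_num` is just an illustrative name:

```r
library(dplyr)

# Hypothetical miniature version of the question's data frame
df <- data.frame(
  Time    = c("1/20/15 6:26 AM", "1/20/15 7:26 AM", "1/20/15 8:26 AM"),
  Server1 = factor(c("1.08", "4", "No data")),
  Server2 = factor(c("1.67", "No data", "4.33"))
)

# Convert every column except Time; "No data" becomes NA.
# suppressWarnings() hides the "NAs introduced by coercion" warning.
df_num <- df %>%
  mutate(across(-Time, ~ suppressWarnings(as.numeric(as.character(.x)))))

df_num
```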

Answer 2: (score: 1)

data <- replace(data, data == "No data", NA)

cbind(data[1], apply(data[-1], 2, function(x) as.double(as.character(x))))
              Time Server1 Server2 Server3
1  1/20/15 6:26 AM    1.08    1.67    0.83
2  1/20/15 7:26 AM    4.00    4.33    2.33
3  1/20/15 8:26 AM    7.92      NA    3.58
4  1/20/15 9:26 AM   12.08      NA    4.00
5 1/20/15 10:26 AM   15.00    7.75    3.92
6 1/20/15 11:26 AM      NA      NA      NA

Answer 3: (score: 1)

My choice would be

df[, 2:ncol(df)] <- apply(df[, 2:ncol(df)], 2, as.numeric)

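because this seems the most straightforward. There is no need to change "No data" to NA first, because that happens automatically, and you will get a warning message telling you that it happened.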
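The apply() call works on factor columns because apply() first coerces the data frame to a character matrix, so as.numeric() sees strings rather than factor codes. With a few hundred columns it may be more convenient to pick the columns by name and then check how many NAs were introduced. A minimal sketch under those assumptions; the small `df` and the `num_cols` name are made up for the example:

```r
# Hypothetical miniature version of the question's data frame
df <- data.frame(
  Time    = c("1/20/15 6:26 AM", "1/20/15 7:26 AM", "1/20/15 8:26 AM"),
  Server1 = factor(c("1.08", "4", "No data")),
  Server2 = factor(c("1.67", "No data", "4.33"))
)

# Every column except Time, however many there are
num_cols <- setdiff(names(df), "Time")

# lapply() keeps the data frame structure; as.character() guards against factor codes
df[num_cols] <- lapply(df[num_cols], function(x) {
  suppressWarnings(as.numeric(as.character(x)))
})

# Count the NAs per column to see what could not be converted
na_counts <- colSums(is.na(df[num_cols]))
print(na_counts)
```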