Question

I am importing CSV data into R with:

data <- read.csv(file="file_name.csv")

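The data has 9 columns and 5000 rows, and the values are real numbers. I want to use it as a data frame, but the first column is being read as a factor with levels. I don't want those levels.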

Here is a sample of the data in .csv format:

[image: sample of the CSV data]

Can anyone help me remove the levels from the first column after importing it into R?

Here is what I tried:

data$col_1 = as.numeric(as.character(data$col_1))

But it gives a warning:

Warning message:
NAs introduced by coercion

Answer 1

read.csv is essentially a wrapper around read.table. Turning off stringsAsFactors is enough:

data <- read.csv(file="filename", stringsAsFactors=FALSE)

The column should then be read as character, which you can convert to numeric like this:

data$col <- as.numeric(data$col)
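As a rough illustration of why the conversion route matters, here is a minimal sketch that reads the same data both ways. The inline CSV passed through `text =`, the column names `col` and `other`, and the value `bad` are all made up for the example; they are not from the asker's file. In R 4.0.0 and later `stringsAsFactors` already defaults to FALSE, so it is set explicitly here only to show both behaviours.

```r
# Hypothetical inline data standing in for the real CSV file.
csv_text <- "col,other\n10.5,a\n20.25,b\n30,c\nbad,d"

as_factor <- read.csv(text = csv_text, stringsAsFactors = TRUE)
as_char   <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Calling as.numeric() directly on a factor returns the internal
# level codes, not the original values.
codes <- as.numeric(as_factor$col)

# Going through as.character() first recovers the values; entries that
# are not numbers become NA (the warning is silenced here on purpose).
from_factor <- suppressWarnings(as.numeric(as.character(as_factor$col)))

# With stringsAsFactors = FALSE the column is already character,
# so as.numeric() can be applied directly.
from_char <- suppressWarnings(as.numeric(as_char$col))

print(from_char)
```

Both routes give the same numeric vector; the NA marks the one value that was not a number.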

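Note: if a column is clean and contains only numbers, read.csv will read it as numeric on its own. If it comes in as a factor, R has detected text or some other non-numeric content in it. Pay attention to the warnings and check which records are being converted to NA, and why.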
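The type detection described in the note can be checked on small inputs. This is only a sketch: the three inline CSV snippets and the names in them are invented for illustration, apart from the `ohyeah` value used in the example further down.

```r
# A column holding only numbers is read as a numeric type.
clean <- read.csv(text = "name,id\nann,1\nbob,2\ncat,3",
                  stringsAsFactors = FALSE)

# Blank fields and the string NA count as missing, so the column
# is still read as a numeric type.
with_na <- read.csv(text = "name,id\nann,1\nbob,NA\ncat,",
                    stringsAsFactors = FALSE)

# A single non-numeric entry turns the whole column into text:
# character here, or a factor when stringsAsFactors = TRUE.
dirty_text <- "name,id\nann,1\nbob,2\ndan,ohyeah"
dirty   <- read.csv(text = dirty_text, stringsAsFactors = FALSE)
dirty_f <- read.csv(text = dirty_text, stringsAsFactors = TRUE)

print(sapply(list(clean = clean$id, with_na = with_na$id,
                  dirty = dirty$id, dirty_f = dirty_f$id), class))
```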

For example, I have this CSV file:

[image: sample CSV file with name and id columns]

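When I read it in, the id column is treated as character because one row contains ohyeah (if that field were empty or NA, R would still treat the column as numeric). I suggest you first pull out the contaminated records and see how big the problem is.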

> subset(data, is.na(as.numeric(id)))
  name     id
4  dan ohyeah
Warning message:
In eval(expr, envir, enclos) : NAs introduced by coercion
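One possible way to run the same check without the warning, and then finish the conversion, is sketched below. The inline data is an assumption: only the `dan` / `ohyeah` row comes from the output above, the other three rows are placeholders.

```r
# Hypothetical reconstruction of the example file; only row 4 is
# taken from the output shown above.
data <- read.csv(text = "name,id\nann,1\nbob,2\ncat,3\ndan,ohyeah",
                 stringsAsFactors = FALSE)

# Convert once, silencing the expected coercion warning.
id_num <- suppressWarnings(as.numeric(data$id))

# Rows that became NA only because of the conversion
# (not because the value was already missing).
bad <- is.na(id_num) & !is.na(data$id)

print(data[bad, ])  # inspect the contaminated records
print(sum(bad))     # how many there are

# Once the bad rows have been reviewed, keep the numeric version.
data$id <- id_num
```

Whether to drop the bad rows, fix them in the source file, or leave them as NA depends on how many there are and what they mean.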