Question

I have a large table in a .txt file that I want to read into R. I am using the read.table function, but reading fails with the following error message:

Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  : 
  line 28 did not have 23 elements

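It seems that line 28 of the data (counting from the first line read, not including the header lines I skip with skip=) is missing elements. I am looking for a way to correct this automatically by filtering out that line. Right now I cannot even read the file in, so I cannot manipulate it in R... Any suggestions are much appreciated :)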

Answer 1

Here is my approach: call read.table with the option fill=TRUE, then exclude the rows that do not have all fields filled in (identified by calling count.fields).

Example:

# 1. Data generation, and saving in 'tempfile'
cat("1 John", "2 Paul", "7 Pierre", '9', file = "tempfile", sep = "\n")

# 2. read the data:
data = read.table('tempfile', fill = TRUE)

# 3. exclude incomplete data
c.fields = count.fields('tempfile')
# keep only the rows whose field count equals the maximum found in the file
data = data[c.fields == max(c.fields), ]

(Edited to determine the expected number of fields automatically.)
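
When the file has a header line, count.fields still counts that line, so its result is one element longer than the data frame returned by read.table. The following is a minimal sketch of the same approach for that case, not part of the original answer; the file contents and the names path, n_fields, raw, keep and clean are made up for illustration, and it assumes the header line itself is complete.

```r
# Hypothetical sample file: a header line, three complete rows, one short row
path <- tempfile(fileext = ".txt")
writeLines(c("id name", "1 John", "2 Paul", "9", "7 Pierre"), path)

# Field count per line of the file, header line included
n_fields <- count.fields(path)

# fill = TRUE pads short rows instead of raising an error
raw <- read.table(path, header = TRUE, fill = TRUE, stringsAsFactors = FALSE)

# Drop the header's entry so the counts line up with the rows of 'raw',
# then keep only the rows that had the full number of fields
keep <- n_fields[-1] == max(n_fields)
clean <- raw[keep, ]
```

If skip= is also passed to read.table, the same skip value would need to be passed to count.fields so that both calls start at the same line.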

Answer 2

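This error also occurs when your data contains a hash symbol (#), because read.table treats everything after a # as a comment by default.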

If that is the case, simply change the option comment.char to comment.char = "":

read.table("file.txt", comment.char = "")
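
To check whether a hash symbol is what shortens a line, the field counts can be compared with and without comment handling. This is a small sketch on made-up data; the file contents and the names path, fields_default, fields_literal and dat are illustrative only.

```r
# Hypothetical sample file: the second line contains a '#' inside a value
path <- tempfile(fileext = ".txt")
writeLines(c("1 a x", "2 b#c y", "3 d z"), path)

# With the default comment.char = "#", the rest of line 2 is discarded
fields_default <- count.fields(path)

# With comment handling switched off, '#' is ordinary data
fields_literal <- count.fields(path, comment.char = "")

# Lines whose count differs between the two runs are cut short by a '#'
which(fields_default != fields_literal)

# Reading with comment.char = "" keeps the full line
dat <- read.table(path, comment.char = "", stringsAsFactors = FALSE)
```

If the two count vectors are identical, the missing elements come from somewhere else and turning off comment.char will not help.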
