read.fwf error "line x did not have 5 elements" - possibly due to special characters

Time: 2016-12-07 06:25:32

Tags: r read.table read.fwf

read.fwf reads this fixed-width text without problems:

lines = NULL
lines[1] = '                BUTORPHANOL TARTRATE            VIAL       2 MG/ML         '
lines[2] = '                B3/AZEL AC/ZINC/B6/COPPER/FA    TABLET     600-5-500       '

write(lines, 'lines.txt')
read.fwf('lines.txt', widths = c(16, 32, 11, 12, 3), as.is = TRUE, skip = 0)
# works:
#   V1 V2                               V3          V4           V5
# 1 NA BUTORPHANOL TARTRATE             VIAL        2 MG/ML      NA
# 2 NA B3/AZEL AC/ZINC/B6/COPPER/FA     TABLET      600-5-500    NA

Adding one more line causes an error:

lines[3] = '                C/B-6/NIACIN/FA/B12/HERB#192    CAPSULE    60-5-2.5MG      ' # this line causes error

write(lines, 'lines.txt')
read.fwf('lines.txt', widths = c(16, 32, 11, 12, 3), as.is = TRUE, skip = 0)
  

Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, : line 3 did not have 5 elements

My only guess is that line 3 contains some special character. Can anyone help? Thanks.

1 Answer:

Answer 0: (score: 4)

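The culprit is the # in HERB#192. read.fwf passes its extra arguments on to read.table, whose default comment.char is "#", so everything after the # on line 3 is treated as a comment and the line ends up with fewer than 5 fields. We can turn comment handling off by specifying comment.char = "":

read.fwf('lines.txt', widths = c(16, 32, 11, 12, 3), as.is = TRUE, skip = 0, comment.char="")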
#  V1                               V2          V3           V4 V5
#1 NA BUTORPHANOL TARTRATE             VIAL        2 MG/ML      NA
#2 NA B3/AZEL AC/ZINC/B6/COPPER/FA     TABLET      600-5-500    NA
#3 NA C/B-6/NIACIN/FA/B12/HERB#192     CAPSULE     60-5-2.5MG   NA
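As a rough, self-contained sketch of the behaviour described above, the following reproduces both the failure and the fix on the three lines from the question. The temporary file and the variable names (path, failed, fixed, trimmed) are illustrative assumptions, and the last call adds strip.white = TRUE, another read.table argument, only to show how the padding can be trimmed; it is not part of the original answer.

```r
# The three fixed-width records from the question.
lines <- c(
  '                BUTORPHANOL TARTRATE            VIAL       2 MG/ML         ',
  '                B3/AZEL AC/ZINC/B6/COPPER/FA    TABLET     600-5-500       ',
  '                C/B-6/NIACIN/FA/B12/HERB#192    CAPSULE    60-5-2.5MG      '
)
path <- tempfile(fileext = ".txt")  # hypothetical scratch file
writeLines(lines, path)
widths <- c(16, 32, 11, 12, 3)

# Default comment.char = "#": the text after '#' on line 3 is dropped,
# so that line has too few fields and the read fails.
default_result <- tryCatch(
  read.fwf(path, widths = widths, as.is = TRUE),
  error = function(e) e
)
failed <- inherits(default_result, "error")

# comment.char = "" is forwarded to read.table() and disables comment handling.
fixed <- read.fwf(path, widths = widths, as.is = TRUE, comment.char = "")

# Optional: strip.white = TRUE trims the padding around character fields.
trimmed <- read.fwf(path, widths = widths, as.is = TRUE,
                    comment.char = "", strip.white = TRUE)
```

Here failed should be TRUE, while fixed and trimmed should each have 3 rows and 5 columns, with the # preserved in the second column of row 3.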