Reading quoted values from a txt file with data.table::fread()

Time: 2017-04-07 11:08:00

Tags: r data.table

I have a simple txt file (the values are quoted and separated by tabs):

"Col1" "Col2" "Col3"  
"A" "1,1" "C"  
"B" "2,1" "C"  
"C" "3,1" "C"  

I would like to read this file with fread(). Since the middle column should be numeric, I use dec = ",".

However, the command:

fread("myFile.txt", sep = "\t", dec = ",", header = TRUE, stringsAsFactors = FALSE)

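does not read Col2 as numeric. Specifying colClasses = c("character", "numeric", "character") makes no difference.

Is there a way to read the file correctly with fread() alone (without post-processing)?

Any help is much appreciated.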

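1 answer:

Answer 0: (score: 2)

I'm going to backtrack a bit on my earlier comment; it looks like read.table does handle this case successfully.

Using the following object for demonstration: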

df <- data.frame(
    Col1 = LETTERS[1:3], 
    Col2 = sub(".", ",", 1:3 + 0.1, fixed = TRUE), 
    Col3 = rep("C", 3), 
    stringsAsFactors = FALSE
)

which looks like this on disk:

write.table(
    df,
    sep = "\t", 
    row.names = FALSE
)
# "Col1"    "Col2"  "Col3"
# "A"   "1,1"   "C"
# "B"   "2,1"   "C"
# "C"   "3,1"   "C"

Writing this to a temporary file:

tf <- tempfile()
write.table(
    df,
    file = tf,
    sep = "\t", 
    row.names = FALSE
)

When given the right arguments, read.table parses the second column as numeric:

str(read.table(tf, header = TRUE, sep = "\t", dec = ","))
# 'data.frame': 3 obs. of  3 variables:
#  $ Col1: chr  "A" "B" "C"
#  $ Col2: num  1.1 2.1 3.1
#  $ Col3: chr  "C" "C" "C"

More conveniently, read.delim2 works as well, since its defaults are already sep = "\t" and dec = ",":

str(read.delim2(tf, header = TRUE))
# 'data.frame': 3 obs. of  3 variables:
#  $ Col1: chr  "A" "B" "C"
#  $ Col2: num  1.1 2.1 3.1
#  $ Col3: chr  "C" "C" "C"

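I can't really say why fread fails to handle this case, but if it is common enough, the package maintainers may want to consider supporting it. You could open an issue on the GitHub repository and ask about it.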
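
If fread() has to be used anyway, one possible fallback is to read every column as character and convert the decimal commas afterwards. This is post-processing, which the question hoped to avoid, so treat it as a rough sketch rather than a fix. It assumes the data.table package is installed; the names tf and dt are only illustrative, and the file is recreated from the sample in the question.

```r
library(data.table)

# Recreate the file from the question: quoted, tab-separated, decimal comma.
tf <- tempfile(fileext = ".txt")
writeLines(
    c('"Col1"\t"Col2"\t"Col3"',
      '"A"\t"1,1"\t"C"',
      '"B"\t"2,1"\t"C"',
      '"C"\t"3,1"\t"C"'),
    tf
)

# Read all columns as character so the result does not depend on
# how fread detects types for quoted fields.
dt <- fread(tf, sep = "\t", header = TRUE, colClasses = "character")

# Swap the decimal comma for a point, then convert Col2 by reference.
dt[, Col2 := as.numeric(sub(",", ".", Col2, fixed = TRUE))]

str(dt)
```

Forcing character on read keeps the conversion step explicit, at the cost of one extra line per numeric column.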