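The dataset I am trying to read is oddly formatted: one column contains a lot of multi-line text.
read.csv("the_ill_formated_file.csv")
reads some of it, but the columns get mixed up in some rows, and it then throws this warning message: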
Warning message:
In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
EOF within quoted string
fread("the_ill_formated_file.csv")
cannot read the file at all and fails with this error message:
Error in fread("the_ill_formated_file.csv") :
Internal error. No eol2 immediately before line 30, 'p' instead
In addition: Warning message:
In fread("the_ill_formated_file.csv") :
Detected eol as \n\r, a highly unusual line ending. According to Wikipedia the Acorn BBC used this. If it is intended that the first column on the next row is a character column where the first character of the field value is \r (why?) then the first column should start with a quote (i.e. 'protected'). Proceeding with attempt to read the file.
Here is a snippet showing how the file is formatted:
"comment_id", "comment", "post_date", "reply_count", "reply_ids"
1001, "This comment is multi-line with
space between each line!
Quite a fancy format this one", "2015-08-16" , 3, "{1,2,3}"
1002, "This second row is all on a single line, which is the usual format read.csv/fread in R will expect it", "2015-08-17" , 0, "{}"
Opening the file in Excel gives the same mixed-up columns. Thanks in advance for your help.