Question

I need to read a CSV file with the following code:

originalDataset <- fread("file.csv", 
                         encoding = "UTF-8", sep = ",", 
                         select = c("OperationDate","TenantID","Type","EMail","ClientType","Param4"))

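The separator is ",", but sometimes a field contains an unexpected string with a line break inside it, which splits one record across two lines, as in the fourth data row below. In that case I get the error:

Expected 14 fields but found 12.

How can I handle this kind of data when reading the file? Thanks in advance.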

Data

ID,DBID,OperationDate,TID,Type,EMail,ClientIPAddress,ClientType,Param1,Param2,Param3,Param4,Param5,Detail
619,1,2019-08-08 03:01:00.310,2300,101,a@example.com,3.10.226.203,C,639,0,0,NULL,NULL,ANULL
402,1,2019-08-08 02:50:51.300,2300,109,fa@example.com,3.10.226.203,C,639,0,0,NULL,NULL,NULL
395,1,2019-08-08 02:50:19.377,2300,101,a@example.com,3.10.226.203,C,6387,0,0,NULL,NULL,NULL
341,1,2019-08-08 01:46:21.390,2300,104,a@example.com,3.10.226.203,A,1352,23,234630,Here is an unexpected string
which has return,NULL,NULL
329,1,2019-08-08 01:45:52.673,2300,101,a@example.com,39.1.226.203,A,6411,0,0,NULL,NULL,NULL

Answer 1

Your original call produces this warning:

originalDataset <- fread("test.csv", encoding = "UTF-8", sep = ",", select = c("OperationDate","TID","Type","EMail","ClientType","Param4"))
Warning message:
In fread("test.csv", encoding = "UTF-8", sep = ",", select = c("OperationDate", :
  Stopped early on line 5. Expected 14 fields but found 12. Consider fill=TRUE and comment.char=. First discarded non-empty line: <<341,1,2019-08-08 01:46:21.390,2300,104,a@example.com,3.10.226.203,A,1352,23,234630,Here is an unexpected >>

When I use the argument suggested in the warning, fill=TRUE, the data is read correctly.
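Note that fill=TRUE pads short rows with missing values rather than rejoining them. If the split record should come back as a single row, one possible approach is to repair the lines before handing them to fread. The following is only a sketch under some assumptions: merge_broken_lines is a hypothetical helper (not part of data.table), fields are not quoted, and the stray text contains no extra separators. The sample lines are copied from the question so that the example runs on its own.

```r
library(data.table)

# Sample lines mirroring the question's data; one record is split in two.
# For a real file this could be: raw_lines <- readLines("file.csv", encoding = "UTF-8")
raw_lines <- c(
  "ID,DBID,OperationDate,TID,Type,EMail,ClientIPAddress,ClientType,Param1,Param2,Param3,Param4,Param5,Detail",
  "619,1,2019-08-08 03:01:00.310,2300,101,a@example.com,3.10.226.203,C,639,0,0,NULL,NULL,ANULL",
  "341,1,2019-08-08 01:46:21.390,2300,104,a@example.com,3.10.226.203,A,1352,23,234630,Here is an unexpected string",
  "which has return,NULL,NULL",
  "329,1,2019-08-08 01:45:52.673,2300,101,a@example.com,39.1.226.203,A,6411,0,0,NULL,NULL,NULL"
)

# Hypothetical helper: glue a line onto the following ones until the record
# contains as many separators as the header line.
merge_broken_lines <- function(lines, sep = ",") {
  count_sep <- function(x) lengths(regmatches(x, gregexpr(sep, x, fixed = TRUE)))
  lines <- lines[nzchar(lines)]        # drop empty lines
  target <- count_sep(lines[1])        # separators in the header
  out <- character(0)
  buffer <- NULL
  for (line in lines) {
    # Join the pieces of a split record with a space in place of the line break.
    buffer <- if (is.null(buffer)) line else paste(buffer, line)
    if (count_sep(buffer) >= target) {
      out <- c(out, buffer)
      buffer <- NULL
    }
  }
  if (!is.null(buffer)) out <- c(out, buffer)  # keep any incomplete tail
  out
}

fixed <- merge_broken_lines(raw_lines)

# fread accepts the repaired lines through its text argument.
originalDataset <- fread(
  text = fixed, sep = ",",
  select = c("OperationDate", "TID", "Type", "EMail", "ClientType", "Param4")
)
```

With the sample above, the two halves of record 341 end up in one row, and Param4 holds the rejoined text. If fields can legitimately contain the separator, counting separators is not reliable and the file would need to be fixed at the source (for example by quoting the field).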
