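Reading a pipe-delimited file with an unquoted text field in R

Time: 2019-06-17 14:03:14

Tags: r csv special-characters delimiter

I am trying to load a |-delimited file into R and am having trouble with the free-text part of the file, which does not parse well.

The data looks like this: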

c("ENMI|Close_Type|ENMIN|Close_Number|ENMIND|Close_Date_Time|Close_Description|Close_Status|Report_Type|Close_Text", "", "1001|GFP|194|3287|141|01/2020 12:00:00 AM|Summary Report|Signed|DIST|Report Status:  Signed",  "                         ;", "", "NAME: Rabbit, Roger                  UNIT NUMBER: 110 toontown",  "", "", "", "For 01/2019 - 01/2020",  "", "", "", "", "", "",  "", "", "when the cat ran past;",  "the mouse resting;",  "beneath", "the shade of a pine tree.", "", "The cat was too busy",  "to appreciate the opportunity." )

If I try to use read.csv as usual, specifically:

df <- read.csv(text = c("ENMI|Close_Type|ENMIN|Close_Number|ENMIND|Close_Date_Time|Close_Description|Close_Status|Report_Type|Close_Text", 
                        "", "1001|GFP|194|3287|141|01/2020 12:00:00 AM|Summary Report|Signed|DIST|Report Status:  Signed", 
                        "                             ;", "", "NAME: Rabbit, Roger                  UNIT NUMBER: 110 toontown", 
                        "", "", "", "For 01/2019 - 01/2020", "", "", "", "", "", "", 
                        "", "", "when the cat ran past;", "the mouse resting;", "beneath", 
                        "the shade of a pine tree.", "", "The cat was too busy", "to appreciate the opportunity."), 
               header = TRUE, 
               sep = "|", 
               quote = "")

the continuation lines of the text are loaded into the first column, ENMI, when I want them under Close_Text.

> str(df)

'data.frame':   10 obs. of  10 variables:
 $ ENMI             : Factor w/ 10 levels "                             ;",..: 2 1 5 4 10 7 3 8 6 9
 $ Close_Type       : Factor w/ 2 levels "","GFP": 2 1 1 1 1 1 1 1 1 1
 $ ENMIN            : int  194 NA NA NA NA NA NA NA NA NA
 $ Close_Number     : int  3287 NA NA NA NA NA NA NA NA NA
 $ ENMIND           : int  141 NA NA NA NA NA NA NA NA NA
 $ Close_Date_Time  : Factor w/ 2 levels "","01/2020 12:00:00 AM": 2 1 1 1 1 1 1 1 1 1
 $ Close_Description: Factor w/ 2 levels "","Summary Report": 2 1 1 1 1 1 1 1 1 1
 $ Close_Status     : Factor w/ 2 levels "","Signed": 2 1 1 1 1 1 1 1 1 1
 $ Report_Type      : Factor w/ 2 levels "","DIST": 2 1 1 1 1 1 1 1 1 1
 $ Close_Text       : Factor w/ 2 levels "","Report Status:  Signed": 2 1 1 1 1 1 1 1 1 1

I would like the data frame to load as a single observation of 10 variables.

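Any help finding the right question to ask, or a pointer to where this has been asked before, would be greatly appreciated. Otherwise I will find another way to parse it, probably via readLines and {{1} }.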
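
For reference, the readLines route might look roughly like the sketch below. It rests on assumptions the sample data does not confirm: every real record line contains exactly 9 pipes (10 fields), and no continuation line of Close_Text happens to contain exactly 9 pipes. The vector `lines` stands in for the result of readLines() on the actual file, and `n_sep` and `split_record` are names made up for illustration.

```r
# Stand-in for readLines("<path to file>"); shortened from the sample data
lines <- c(
  "ENMI|Close_Type|ENMIN|Close_Number|ENMIND|Close_Date_Time|Close_Description|Close_Status|Report_Type|Close_Text",
  "",
  "1001|GFP|194|3287|141|01/2020 12:00:00 AM|Summary Report|Signed|DIST|Report Status:  Signed",
  "                         ;",
  "",
  "NAME: Rabbit, Roger                  UNIT NUMBER: 110 toontown",
  "when the cat ran past;",
  "the mouse resting;"
)

n_sep <- 9  # assumption: a record line has exactly 9 pipes (10 fields)

header <- strsplit(lines[1], "|", fixed = TRUE)[[1]]
body <- lines[-1]

# Count the pipes on each line; a line with n_sep pipes starts a new record
pipe_count <- lengths(regmatches(body, gregexpr("|", body, fixed = TRUE)))
record_id <- cumsum(pipe_count == n_sep)
keep <- record_id > 0  # drop anything before the first record line

# Glue each record line together with its continuation lines
records <- vapply(split(body[keep], record_id[keep]),
                  paste, character(1), collapse = "\n")

# Split on the first n_sep pipes only; everything after stays in Close_Text
split_record <- function(rec) {
  parts <- strsplit(rec, "|", fixed = TRUE)[[1]]
  c(parts[seq_len(n_sep)],
    paste(parts[-seq_len(n_sep)], collapse = "|"))
}

fields <- t(vapply(records, split_record, character(n_sep + 1)))
df <- as.data.frame(fields, stringsAsFactors = FALSE)
names(df) <- header

# Convert numeric-looking columns, leaving the text columns as character
df[] <- lapply(df, type.convert, as.is = TRUE)

str(df)
```

With this approach the multi-line text ends up in Close_Text with embedded newlines, and each record becomes one row. If a continuation line could legitimately contain 9 pipes, the record-start test would need a stricter rule, for example checking that the first field is numeric.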

Thanks, everyone, for your time!

0 answers:

No answers