I'm trying to load a |-delimited file into R and am having trouble with the text portion of the file, which does not parse well.
The data looks like this:
c("ENMI|Close_Type|ENMIN|Close_Number|ENMIND|Close_Date_Time|Close_Description|Close_Status|Report_Type|Close_Text", "", "1001|GFP|194|3287|141|01/2020 12:00:00 AM|Summary Report|Signed|DIST|Report Status: Signed", " ;", "", "NAME: Rabbit, Roger UNIT NUMBER: 110 toontown", "", "", "", "For 01/2019 - 01/2020", "", "", "", "", "", "", "", "", "when the cat ran past;", "the mouse resting;", "beneath", "the shade of a pine tree.", "", "The cat was too busy", "to appreciate the opportunity." )
If I try to use read.csv as usual, specifically:
df <- read.csv(text = c("ENMI|Close_Type|ENMIN|Close_Number|ENMIND|Close_Date_Time|Close_Description|Close_Status|Report_Type|Close_Text",
"", "1001|GFP|194|3287|141|01/2020 12:00:00 AM|Summary Report|Signed|DIST|Report Status: Signed",
" ;", "", "NAME: Rabbit, Roger UNIT NUMBER: 110 toontown",
"", "", "", "For 01/2019 - 01/2020", "", "", "", "", "", "",
"", "", "when the cat ran past;", "the mouse resting;", "beneath",
"the shade of a pine tree.", "", "The cat was too busy", "to appreciate the opportunity."),
header = TRUE,
sep = "|",
quote = "")
The text is loaded into the first column, under ENMI, when I want it under Close_Text.
> str(df)
'data.frame': 10 obs. of 10 variables:
$ ENMI : Factor w/ 10 levels " ;",..: 2 1 5 4 10 7 3 8 6 9
$ Close_Type : Factor w/ 2 levels "","GFP": 2 1 1 1 1 1 1 1 1 1
$ ENMIN : int 194 NA NA NA NA NA NA NA NA NA
$ Close_Number : int 3287 NA NA NA NA NA NA NA NA NA
$ ENMIND : int 141 NA NA NA NA NA NA NA NA NA
$ Close_Date_Time : Factor w/ 2 levels "","01/2020 12:00:00 AM": 2 1 1 1 1 1 1 1 1 1
$ Close_Description: Factor w/ 2 levels "","Summary Report": 2 1 1 1 1 1 1 1 1 1
$ Close_Status : Factor w/ 2 levels "","Signed": 2 1 1 1 1 1 1 1 1 1
$ Report_Type : Factor w/ 2 levels "","DIST": 2 1 1 1 1 1 1 1 1 1
$ Close_Text : Factor w/ 2 levels "","Report Status: Signed": 2 1 1 1 1 1 1 1 1 1
I would like the data frame to load as a single observation of 10 variables.
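Any help finding the right question to ask, or a pointer to where this has been asked before, would be incredibly helpful. Otherwise I will find another way to parse it, probably with readLines and {{1}}.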
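For reference, here is a rough sketch of what the readLines fallback might look like. It is not a definitive solution. It assumes that every record starts on a line containing exactly nine "|" characters (one fewer than the number of header fields) and that the free-text continuation lines never contain that many. The names raw_lines, is_start and record_id are made up for illustration, and with a real file raw_lines would come from readLines() on that file rather than being typed inline.

```r
# Stand-in for readLines("your_file.txt"); the path would be whatever the real file is.
raw_lines <- c(
  "ENMI|Close_Type|ENMIN|Close_Number|ENMIND|Close_Date_Time|Close_Description|Close_Status|Report_Type|Close_Text",
  "", "1001|GFP|194|3287|141|01/2020 12:00:00 AM|Summary Report|Signed|DIST|Report Status: Signed",
  " ;", "", "NAME: Rabbit, Roger UNIT NUMBER: 110 toontown",
  "", "", "", "For 01/2019 - 01/2020", "", "", "", "", "", "",
  "", "", "when the cat ran past;", "the mouse resting;", "beneath",
  "the shade of a pine tree.", "", "The cat was too busy", "to appreciate the opportunity."
)

# The first line is the header; everything else is record data or continuation text.
header <- strsplit(raw_lines[1], "|", fixed = TRUE)[[1]]
body   <- raw_lines[-1]

# Count the "|" characters on each line.
n_pipes <- lengths(regmatches(body, gregexpr("|", body, fixed = TRUE)))

# ASSUMPTION: a line with (number of fields - 1) pipes starts a new record.
is_start  <- n_pipes == length(header) - 1
record_id <- cumsum(is_start)

# Drop anything before the first record, then glue each record's lines together
# so the continuation lines become part of the last field.
keep    <- record_id > 0
records <- tapply(body[keep], record_id[keep], paste, collapse = "\n")

# Split each glued record on "|" and bind the pieces into one row per record.
fields <- strsplit(as.character(records), "|", fixed = TRUE)
df <- as.data.frame(do.call(rbind, fields), stringsAsFactors = FALSE)
names(df) <- header

# Let R guess column types; text columns stay character because of as.is = TRUE.
df[] <- lapply(df, type.convert, as.is = TRUE)

str(df)
```

With the sample above this should give one row and ten columns, with all of the trailing text inside Close_Text and separated by newlines. If a record's Close_Text were empty, or if the free text itself contained "|", the split step would need extra handling.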
Thanks, everyone, for your time!