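I extracted data from a table in a PDF, but the result is a character vector of strings. I would like to turn it into a matrix.
For example: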
[1] "XX/R011680/2  Fun  9-10  XX/R008108/2  No fun  *N/A"
[2] "XX/X002103/2  Fun  8-8.9  XX/S00257X/2  No fun  *N/A"
[3] "XX/X011443/2  Fun  8-8.9"
[4] "XX/X008728/2  No fun  7-7.9"
Is there a way to split it up so that it becomes a matrix like this?
     [,1]           [,2]     [,3]    [,4]           [,5]     [,6]
[1,] "XX/R011680/2" "Fun"    "9-10"  "XX/R008108/2" "No fun" "*N/A"
[2,] "XX/X002103/2" "Fun"    "8-8.9" "XX/S00257X/2" "No fun" "*N/A"
[3,] "XX/X011443/2" "Fun"    "8-8.9" NA             NA       NA
[4,] "XX/X008728/2" "No fun" "7-7.9" NA             NA       NA
Or would this layout be more convenient? The order of the rows does not matter, because I can sort them later.
     [,1]           [,2]     [,3]
[1,] "XX/R011680/2" "Fun"    "9-10"
[2,] "XX/R008108/2" "No fun" "*N/A"
[3,] "XX/X002103/2" "Fun"    "8-8.9"
[4,] "XX/S00257X/2" "No fun" "*N/A"
[5,] "XX/X011443/2" "Fun"    "8-8.9"
[6,] "XX/X008728/2" "No fun" "7-7.9"
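Answer 0 (score: 0)
Assuming the input L is as given reproducibly in the Note below, remove the double quotes, convert each run of 2 or more spaces to a comma, and then read the result with read.table: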
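# Turn each run of 2 or more spaces into a comma, then strip the literal double quotes.
L2 <- gsub('"', '', gsub(' {2,}', ',', L))
# fill = TRUE pads the shorter rows with blank fields so every row has six columns.
read.table(text = L2, as.is = TRUE, sep = ",", fill = TRUE)
Note
# Fields are shown separated by two spaces, matching the "2 or more spaces" rule above.
L <-
c("\"XX/R011680/2  Fun  9-10  XX/R008108/2  No fun  *N/A\"",
"\"XX/X002103/2  Fun  8-8.9  XX/S00257X/2  No fun  *N/A\"",
"\"XX/X011443/2  Fun  8-8.9\"",
"\"XX/X008728/2  No fun  7-7.9\""
)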
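To get an actual matrix rather than a data frame, one option is to split each string directly. The following is a minimal sketch, not the read.table method used above: it swaps in strsplit and assumes that fields are separated by two or more spaces while "No fun" contains only a single space. The names x, parts, wide, and long are illustrative only.

```r
# Assumed input: fields separated by two or more spaces (an assumption about
# the real data); "No fun" keeps its single inner space.
x <- c("XX/R011680/2  Fun  9-10  XX/R008108/2  No fun  *N/A",
       "XX/X002103/2  Fun  8-8.9  XX/S00257X/2  No fun  *N/A",
       "XX/X011443/2  Fun  8-8.9",
       "XX/X008728/2  No fun  7-7.9")

# Split every string on runs of two or more spaces.
parts <- strsplit(x, " {2,}")

# Indexing 1:6 pads short rows with NA; transpose so each string is one row.
wide <- t(sapply(parts, `[`, 1:6))

# Stack columns 4-6 under columns 1-3, then drop the rows that were only padding.
long <- rbind(wide[, 1:3], wide[, 4:6])
long <- long[!is.na(long[, 1]), , drop = FALSE]

print(wide)
print(long)
```

Here wide corresponds to the six-column layout in the question and long to the three-column one; the rows of long come out in a different order, which the question says is acceptable. Calling as.matrix on the data frame returned by read.table is another route to a matrix.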