Converting lines of text to a matrix in R

Time: 2018-08-13 19:50:39

Tags: r string

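I have a file that I read in R with readLines. Between the line indices sndx and endx there is a table of space-separated numbers, and I want to convert it to a matrix. For example, a toy example file is: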

======
3 5   # this is how I know sndx and endx
Some text
1  123.  456. 789.
2  345.  678. 123.
3  235.  123. 345.
More text
======

The desired output would be the matrix:

1  123.  456. 789.
2  345.  678. 123.
3  235.  123. 345.

Is there a way to extract the numeric lines this way?

1 Answer:

Answer 0: (score: 0)

Example:

"Some text
endx
1  123.  456. 789.
2  345.  678. 123.
3  235.  123. 345.
sndx
More text"

Using strsplit:

# readClipboard() is Windows-only; the example text above is assumed to be on the clipboard
char_vec <- trimws(readClipboard())

# Need the string after 'endx'
str_start <- grep('endx', char_vec)+1

# And the string before 'sndx'
str_end <- grep('sndx', char_vec)-1

# sapply() returns one column per input line, so transpose to get one row per line.
# Note: gsub() strips the trailing dots and the result is a character matrix.
t(sapply(str_start:str_end, function(z){
  u <- char_vec[z]
  ret <- strsplit(x = gsub('\\.', "", u), split = '[[:space:]]{1,5}')[[1]]
  return(ret)
}))

Output:

> t(sapply(str_start:str_end, function(z){
+   u <- char_vec[z]
+   ret <- strsplit(x = gsub('\\.', "", u), split = '[[:space:]]{1,5}')[[1]]
+   return(ret)
+ }))
     [,1] [,2]  [,3]  [,4] 
[1,] "1"  "123" "456" "789"
[2,] "2"  "345" "678" "123"
[3,] "3"  "235" "123" "345"
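
The result above is a character matrix, and it relies on marker strings rather than on the line indices given in the question. If a numeric matrix is wanted and sndx and endx come from the first line of the file, one possible sketch is shown below. It assumes the first line holds the two indices before the `#` comment, and the names `lines`, `idx`, `tbl` and `m` are only illustrative. With a real file, `lines` would come from readLines() instead of being typed in.

```r
# Toy file contents, as readLines() would return them (one string per line).
lines <- c(
  "3 5   # this is how I know sndx and endx",
  "Some text",
  "1  123.  456. 789.",
  "2  345.  678. 123.",
  "3  235.  123. 345.",
  "More text"
)

# Assumption: the first two numbers on line 1 are sndx and endx.
# Drop the trailing comment, then parse the remaining numbers.
idx <- scan(text = sub("#.*$", "", lines[1]), quiet = TRUE)
sndx <- idx[1]
endx <- idx[2]

# Parse the selected lines as whitespace-separated numbers.
# read.table() accepts "123." as the number 123.
tbl <- read.table(text = lines[sndx:endx], header = FALSE)

# Convert the data frame to a plain numeric matrix without dimnames.
m <- unname(as.matrix(tbl))
print(m)
```

Because read.table() does the parsing, decimal values such as 12.5 would be kept intact, whereas removing every dot with gsub() would turn them into 125.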