Reading a space-separated file in R, skipping the first lines

Time: 2013-01-18 05:56:11

Tags: r

I want to read a file in R into an M-by-N matrix.

The file format is as follows:

# /n/home11/tros/sar/tests/mars/abro 250
# /n/home11/tros/sar/tests/mars/abro 230
# /n/home11/tros/sar/tests/mars/abro 20
# /n/home11/tros/sar/tests/mars/abro 20
# T (M rows,N cols)
# M 3
# N 4
7.947363550e+03 1.066183995e+04 3.896434554e+03 8.319875735e+03
1.600281531e+04 1.991086422e+04 1.628421819e+03 1.239507171e+04 
7.430547003e+03 2.349262184e+03 4.883555574e+03 4.986597752e+02

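The first lines (all the lines that start with the # symbol) should be skipped, although M and N could possibly be read from that header (the # lines).

After that, the numeric matrix of dimensions M by N (3 by 4 in this case) should be read. Note that the separator is just spaces (NOT tabs).

Thanks.

2 answers:

Answer 0: (score: 4)

read.table skips lines starting with # by default: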

s <- "# /n/home11/tros/sar/tests/mars/abro 250
# /n/home11/tros/sar/tests/mars/abro 230
# /n/home11/tros/sar/tests/mars/abro 20
# /n/home11/tros/sar/tests/mars/abro 20
# T (M rows,N cols)
# M 3
# N 4
7.947363550e+03 1.066183995e+04 3.896434554e+03 8.319875735e+03
1.600281531e+04 1.991086422e+04 1.628421819e+03 1.239507171e+04 
7.430547003e+03 2.349262184e+03 4.883555574e+03 4.986597752e+02
"

read.table(header=FALSE, text=s)
##          V1        V2       V3         V4
## 1  7947.364 10661.840 3896.435  8319.8757
## 2 16002.815 19910.864 1628.422 12395.0717
## 3  7430.547  2349.262 4883.556   498.6598

You will probably want to use file= instead of text=, giving it the name of the file to read the data from.
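
As a minimal sketch of reading from a file, assuming the data sit on disk (the temporary file and the shortened header below are illustrative stand-ins for the real file), `read.table` can be given `file=` and the result converted with `as.matrix`, since the question asks for a matrix rather than a data frame:

```r
# Hypothetical input: a temporary file stands in for the real data file.
path <- tempfile(fileext = ".txt")
writeLines(c(
  "# T (M rows,N cols)",
  "# M 3",
  "# N 4",
  "7.947363550e+03 1.066183995e+04 3.896434554e+03 8.319875735e+03",
  "1.600281531e+04 1.991086422e+04 1.628421819e+03 1.239507171e+04 ",
  "7.430547003e+03 2.349262184e+03 4.883555574e+03 4.986597752e+02"
), path)

# comment.char = "#" is the default, so the header lines are dropped;
# the default sep = "" splits fields on any run of whitespace.
dat <- read.table(file = path, header = FALSE)

# read.table returns a data frame; convert it to a plain numeric matrix.
mat <- as.matrix(dat)
dimnames(mat) <- NULL
dim(mat)
```

Dropping the dimnames is optional; it only removes the automatic V1..V4 column labels.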

Answer 1: (score: 0)

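# s is the text defined in the previous answer, so it needs a text connection;
# for a real file, pass the file name to readLines() directly
Lines <- readLines(textConnection(s))
# Extract M and N from the "# M ..." and "# N ..." header lines
M <- as.numeric( sub("^#\\sM" ,"" , Lines[grep("^#\\sM",Lines)]) )
M
#[1] 3
N <- as.numeric( sub("^#\\sN" ,"" , Lines[grep("^#\\sN",Lines)]) )
# Read only the lines that do not start with #
dat <- read.table(text=Lines[grep("^[^#]",Lines)])
dat
#         V1        V2       V3         V4
#1  7947.364 10661.840 3896.435  8319.8757
#2 16002.815 19910.864 1628.422 12395.0717
#3  7430.547  2349.262 4883.556   498.6598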
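
Once M and N are known, the body can also be read straight into a matrix of that shape. The following is only a sketch, assuming the header contains exactly one `# M` line and one `# N` line; the variable name `txt` and the shortened header are illustrative, not part of the original file:

```r
# Illustrative input: the lines of the file as a character vector.
txt <- c(
  "# T (M rows,N cols)",
  "# M 3",
  "# N 4",
  "7.947363550e+03 1.066183995e+04 3.896434554e+03 8.319875735e+03",
  "1.600281531e+04 1.991086422e+04 1.628421819e+03 1.239507171e+04 ",
  "7.430547003e+03 2.349262184e+03 4.883555574e+03 4.986597752e+02"
)

# Pull the dimensions out of the "# M <int>" and "# N <int>" header lines.
M <- as.integer(sub("^#\\s*M\\s+", "",
                    grep("^#\\s*M\\s+[0-9]+\\s*$", txt, value = TRUE)))
N <- as.integer(sub("^#\\s*N\\s+", "",
                    grep("^#\\s*N\\s+[0-9]+\\s*$", txt, value = TRUE)))

# scan() ignores everything after "#" and splits the rest on whitespace.
vals <- scan(text = txt, comment.char = "#", quiet = TRUE)

# Fail early if the header and the body disagree.
stopifnot(length(vals) == M * N)

# The file stores the matrix row by row, so fill by row.
mat <- matrix(vals, nrow = M, ncol = N, byrow = TRUE)
```

Compared with read.table, this gives a numeric matrix directly and uses M and N as a consistency check on the data.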