Reading a text file with R

Time: 2018-04-02 08:50:17

Tags: r

I have data in txt format that I am trying to read into a data frame in R, but I don't know how to do it.

1
[1] "Record 1"
2
[1] 1010286
3
[1] 7
4
[1] F
5
[1] 40
6
[1] 0
7
[1] SE
8
[1] Apt
9
[1] "Record 2"
10
[1] 1000152
11
[1] 5
12
[1] M
13
[1] <NA>
14
[1] 0
15
[1] <NA>
16
[1] Apt

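1 answer:

Answer 0: (score: 0)

As far as I know, none of the direct file-parsing functions (read.table and the like) can coerce a txt file in the following format straight into a data frame.

File: data.txt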

1 [1] "Record 1" 2 [1] 1010286 3 [1] 7 4 [1] F 5 [1] 40 6 [1] 0 7 [1] SE 8 [1] Apt 
9 [1] "Record 2" 10 [1] 1000152 11 [1] 5 12 [1] M 13 [1] 14 [1] 0 15 [1] 16 [1] Apt

I used the following code to read the file and break it up into a data frame:

library(stringr)

# Read the file and split every line on the "<index> [1]" markers
data <- trimws(
          unlist(
            str_split(
               readLines("./Desktop/data.txt", warn = FALSE),
               "[0-9]+ \\[1\\]")))
data <- str_replace_all(data, '"', "")  # strip the quotes around "Record N"
data[data == ""] <- NA                  # replace blank entries with NA

# Each line yields 9 pieces: the empty string before the first marker,
# the "Record N" label, and the 7 values. Reshape to one row per record
# and keep only the 7 value columns.
m <- matrix(data, ncol = 9, byrow = TRUE)
df <- as.data.frame(m[, 3:9, drop = FALSE], stringsAsFactors = FALSE)

# Set column names (optional, but recommended)
colnames(df) <- c("RecordNo", "Value2", "Gender", "Age", "Value5", "Value6", "Value7")

Output:

  RecordNo Value2 Gender  Age Value5 Value6  Value7
1  1010286      7      F   40      0     SE     Apt
2  1000152      5      M <NA>      0   <NA>     Apt
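
For what it's worth, the same idea can be sketched in base R only, in a way that does not depend on how the items are spread across lines. This is a rough sketch and not a definitive solution: `parse_records` is a hypothetical helper name, and the 8 fields per record, the treatment of a literal `<NA>` as missing, and the choice of which columns are integers are all assumptions read off the sample data.

```r
# Hypothetical helper: turn the "<index> [1] value" text into a data frame.
# Assumes every record has exactly `fields_per_record` items, the first
# being the "Record N" label.
parse_records <- function(lines, fields_per_record = 8) {
  # Join all lines so the layout (one item per line, or one record per
  # line) does not matter
  txt <- paste(lines, collapse = " ")

  # Split on the "<index> [1]" markers; the piece before the first
  # marker is empty, so drop it
  pieces <- strsplit(txt, "[0-9]+ \\[1\\]")[[1]][-1]
  pieces <- trimws(gsub('"', "", pieces, fixed = TRUE))

  # Assumption: blank values and a literal "<NA>" both mean missing
  pieces[pieces %in% c("", "<NA>")] <- NA

  # One row per record, then drop the "Record N" label column
  m <- matrix(pieces, ncol = fields_per_record, byrow = TRUE)
  df <- as.data.frame(m[, -1, drop = FALSE], stringsAsFactors = FALSE)
  names(df) <- c("RecordNo", "Value2", "Gender", "Age",
                 "Value5", "Value6", "Value7")

  # Assumption: these columns hold whole numbers in the real data
  int_cols <- c("RecordNo", "Value2", "Age", "Value5")
  df[int_cols] <- lapply(df[int_cols], as.integer)
  df
}

# The two lines of data.txt shown above, inlined to keep the sketch
# self-contained; with a real file, pass readLines(path) instead
sample_lines <- c(
  '1 [1] "Record 1" 2 [1] 1010286 3 [1] 7 4 [1] F 5 [1] 40 6 [1] 0 7 [1] SE 8 [1] Apt ',
  '9 [1] "Record 2" 10 [1] 1000152 11 [1] 5 12 [1] M 13 [1] 14 [1] 0 15 [1] 16 [1] Apt'
)
df <- parse_records(sample_lines)
str(df)
```

One caveat: base `strsplit` drops an empty piece at the very end of the string, so if the last value of the last record is blank, the piece count comes up one short and `matrix` will warn.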