How do I convert multiple tables in a text file into one table with an additional column?

Time: 2015-05-19 15:33:26

Tags: r readline

My text file "myfile.txt" contains many tables with the same columns (name, age, weight, profession). It looks like:

table_ID 001  
John | 38 | 165 | Computer scientist  
Mary | 22 | 122 | Student  

table_ID 002  
Patric| 44 | 105 | Teacher  
Kim | 56 | 155 | Salesman  
Kate | 33 | 133 | Student  
...

table_ID 100  
Peter| 44 | 105 | Teacher  
Han | 56 | 155 | Salesman  
Ken | 33 | 133 | Student  

I want to output a data.frame with an additional column ("table_ID"), which looks like:

table_ID name age weight profession  
001 John  38  165  Computer scientist  
001 Mary  22  122  Student  
002 Patric 44 105  Teacher  
002 Kim  56  155   Salesman  
002 Kate 33  133   Student  
...

100 Peter 44 105 Teacher  
100 Han  56  155 Salesman  
100 Ken 33  133  Student 

How can I do this in R? Thanks a lot.

1 Answer:

Answer 0: (score: 1)

You can try

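library(dplyr)
# bind_rows(.id = ) stands in for unnest(list, table_ID), which newer tidyr versions may not accept
lines <- trimws(readLines('myfile.txt'))
lines <- lines[nzchar(lines) & lines != '...']   # drop blank lines and '...' placeholders
indx <- grepl('^table_ID', lines)                # TRUE on each table header line
lst <- split(lines, cumsum(indx))                # one list element per table
names(lst) <- sub('\\D+', '', sapply(lst, `[`, 1))   # keep only the ID digits as names
res <- bind_rows(lapply(lst, function(x)
     read.table(text=x[-1], header=FALSE, sep="|", strip.white=TRUE,
                col.names=c('name', 'age', 'weight', 'profession'))),
     .id='table_ID')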
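
For reference, the same grouping idea can be sketched in base R alone, without splitting into a list. This is only an illustration under some assumptions: the sample lines are typed inline from the question instead of being read from the file, and the names is_header, group, ids, table_id and body are made up for this sketch. The running sum of the header flags assigns each line its table number, which is then used to look up the ID for every data row.

```r
# Sample input copied from the question (the "..." gap is left out).
lines <- c(
  "table_ID 001  ",
  "John | 38 | 165 | Computer scientist  ",
  "Mary | 22 | 122 | Student  ",
  "",
  "table_ID 002  ",
  "Patric| 44 | 105 | Teacher  ",
  "Kim | 56 | 155 | Salesman  ",
  "Kate | 33 | 133 | Student  "
)

# Normalise whitespace and drop empty lines.
lines <- trimws(lines)
lines <- lines[nzchar(lines)]

# TRUE on header lines; the running sum gives every line its table number.
is_header <- grepl("^table_ID", lines)
group <- cumsum(is_header)

# The ID of each line is the ID found on the header line of its group.
ids <- sub("\\D+", "", lines[is_header])
table_id <- ids[group]

# Parse all data rows in one call and attach the ID column.
body <- read.table(text = lines[!is_header], sep = "|", header = FALSE,
                   strip.white = TRUE, stringsAsFactors = FALSE,
                   col.names = c("name", "age", "weight", "profession"))
res <- data.frame(table_ID = table_id[!is_header], body,
                  stringsAsFactors = FALSE)
print(res)
```

With a real file, the inline vector would be replaced by readLines('myfile.txt'); any literal "..." lines would also need to be filtered out before parsing.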