My text file "myfile.txt" contains many tables that all have the same columns (name, age, weight, profession). It looks like:
table_ID 001
John | 38 | 165 | Computer scientist
Mary | 22 | 122 | Student
table_ID 002
Patric| 44 | 105 | Teacher
Kim | 56 | 155 | Salesman
Kate | 33 | 133 | Student
...
table_ID 100
Peter| 44 | 105 | Teacher
Han | 56 | 155 | Salesman
Ken | 33 | 133 | Student
I want to output a data.frame with an additional column ("table_ID"), which looks like:
table_ID name age weight profession
001 John 38 165 Computer scientist
001 Mary 22 122 Student
002 Patric 44 105 Teacher
002 Kim 56 155 Salesman
002 Kate 33 133 Student
...
100 Peter 44 105 Teacher
100 Han 56 155 Salesman
100 Ken 33 133 Student
How can I do this in R? Many thanks.
Answer 0: (score: 1)
You could try
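library(dplyr)
# Read the raw file; 'myfile.txt' is the file named in the question
lines <- readLines('myfile.txt')
# Flag the header lines, then group each header with the rows that follow it
indx <- grepl('table_ID', lines)
lst <- split(lines, cumsum(indx))
# Name each group by its ID digits (kept as character, so '001' keeps its zeros)
names(lst) <- sub('\\D+', '', sapply(lst, `[`, 1))
# Parse each group's rows (header line dropped) and stack them, adding the ID column
res <- bind_rows(lapply(lst, function(x)
  read.table(text=x[-1], header=FALSE, sep="|", strip.white=TRUE, stringsAsFactors=FALSE,
             col.names=c('name', 'age', 'weight', 'profession'))), .id='table_ID')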
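
If you want to check the idea without the file or any packages, here is a base R sketch. It is a variant of the approach rather than the exact code above: instead of splitting into a list, it carries each header's ID down over its rows and parses all data rows in one go. The sample lines are copied from the question (with the "..." elision left out), and the names is_header and ids are just illustrative.

```r
# Sample rows copied from the question (the "..." elision is left out)
lines <- c("table_ID 001",
           "John | 38 | 165 | Computer scientist",
           "Mary | 22 | 122 | Student",
           "table_ID 002",
           "Patric| 44 | 105 | Teacher",
           "Kim | 56 | 155 | Salesman",
           "Kate | 33 | 133 | Student")

# TRUE on each "table_ID ..." header line
is_header <- grepl("^table_ID", lines)
# Extract the ID digits from the headers and repeat each one over its block
ids <- sub("\\D+", "", lines[is_header])[cumsum(is_header)]

# Parse only the data rows, trimming the spaces around "|"
res <- read.table(text = lines[!is_header], sep = "|", strip.white = TRUE,
                  stringsAsFactors = FALSE,
                  col.names = c("name", "age", "weight", "profession"))
# Prepend the ID of the table each row came from (character, so zeros survive)
res <- cbind(table_ID = ids[!is_header], res, stringsAsFactors = FALSE)
print(res)
```

For the real file, replacing the hard-coded vector with readLines("myfile.txt") should be enough, assuming every table starts with a "table_ID" line and no lines precede the first one.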