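read.table - long text gets split into different columns

Time: 2014-08-15 10:31:52

Tags: r

I am trying to import the data below with the following code, but in the output the multi-word text values are split across different columns.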

postcode <- read.table(file="workbook16.txt", header = T, fill = T)


Seq Suburb  Postcode
1   Melbourne   3000
2   Melbourne   3001
3   East Melbourne  3002
4   West Melbourne  3003
5   Melbourne   3004
6   World Trade Centre  3005
7   Southbank   3006
8   Docklands   3008
9   University Of Melbourne 3010

Where is my mistake?

1 answer:

Answer 0 (score: 0)

One way is to read the data with readLines (here the ten lines are pasted at the console):
lines <- readLines(n=10)
Seq Suburb  Postcode
1   Melbourne   3000
2   Melbourne   3001
3   East Melbourne  3002
4   West Melbourne  3003
5   Melbourne   3004
6   World Trade Centre  3005
7   Southbank   3006
8   Docklands   3008
9   University Of Melbourne 3010

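# Drop the header line, then split each row on whitespace that follows or precedes a digit
lines1 <- lines[-1]
dat <- as.data.frame(do.call(rbind, strsplit(lines1, '(?<=[0-9])\\s+|\\s+(?=[0-9])', perl=TRUE)))
# Take the column names from the header line
colnames(dat) <- scan(text=lines[1], what="")
# Convert Seq and Postcode to numeric (as.character first, in case they are factors)
dat[,c(1,3)] <- lapply(dat[,c(1,3)], function(x) as.numeric(as.character(x)))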

dat
#  Seq                  Suburb Postcode
#1   1               Melbourne     3000
#2   2               Melbourne     3001
#3   3          East Melbourne     3002
#4   4          West Melbourne     3003
#5   5               Melbourne     3004
#6   6      World Trade Centre     3005
#7   7               Southbank     3006
#8   8               Docklands     3008
#9   9 University Of Melbourne     3010
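
The same idea can be tried without the file or interactive input. The following is only a minimal sketch: the names `raw`, `parts` and `dat2` are made up for illustration, the rows are a subset copied from the question, and it assumes that suburb names never contain digits (otherwise the split pattern would cut them too).

```r
# Sample rows copied from the question, held in memory so the sketch
# does not depend on workbook16.txt or on pasting at the console.
raw <- c(
  "Seq Suburb  Postcode",
  "1   Melbourne   3000",
  "3   East Melbourne  3002",
  "6   World Trade Centre  3005",
  "9   University Of Melbourne 3010"
)

# Split each data row on whitespace that follows or precedes a digit,
# so the spaces inside a suburb name are left alone.
parts <- strsplit(raw[-1], "(?<=[0-9])\\s+|\\s+(?=[0-9])", perl = TRUE)

# stringsAsFactors = FALSE keeps Suburb as character in any R version.
dat2 <- as.data.frame(do.call(rbind, parts), stringsAsFactors = FALSE)

# Column names come from the header line, which splits cleanly on whitespace.
colnames(dat2) <- scan(text = raw[1], what = "", quiet = TRUE)

# Seq and Postcode are character at this point; convert them to numeric.
dat2[c("Seq", "Postcode")] <- lapply(dat2[c("Seq", "Postcode")], as.numeric)

str(dat2)
```

The reason read.table splits the names in the first place is that its default separator is any run of whitespace. If workbook16.txt happens to be tab-delimited (the post does not say), read.table(file="workbook16.txt", header=TRUE, sep="\t") may be a simpler fix, since it would split only on tabs.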