I want to split the values contained in a single column into new columns.
My file contains data that looks like this:
> df
V1
1 00006303657102064942660780914135165036 12867 15476 15473 15474 15397 14050
2 00006319625527159782351492300309533775 12867 15473 13678 13497 15397
3 00006327933867965144524703512179615086 12867 14245 15397 15473 15474
I want each value in its own column: V1, V2, V3, V4, V5 and V6
I tried:
df2 <- data.frame(do.call('rbind', strsplit(as.character(df$V1), ' ', fixed = FALSE)))
The output I end up with is:
X1 X2 X3 X4 X5 X6
1 00006303657102064942660780914135165036 12867 15476 15473 15474 15397
2 00006319625527159782351492300309533775 12867 15473 13678 13497 15397
3 00006327933867965144524703512179615086 12867 14245 15397 15473 15474
X7 X8
1 14050 00006303657102064942660780914135165036
2 00006319625527159782351492300309533775 12867
3 00006327933867965144524703512179615086 12867
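Some of the V1 values end up in other columns. This may be happening because there is no space at the end of each line. How can I do this correctly?
Thanks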
Answer 0 (score: 1)
library(tidyr)
library(dplyr)  # loaded for the %>% pipe

# Read each line of the sample data as a single field in column V1
df <- read.table(
  header = FALSE,
  text = "
00006303657102064942660780914135165036 12867 15476 15473 15474 15397 14050
00006319625527159782351492300309533775 12867 15473 13678 13497 15397
00006327933867965144524703512179615086 12867 14245 15397 15473 15474
",
  sep = "\n"
)

df %>%
  separate(
    V1,
    into = paste0("V", 1:7),
    # 'extra' controls rows with more than 7 pieces ("drop" discards the surplus);
    # 'fill' controls rows with fewer pieces ("right" pads with NA on the right)
    extra = "drop",
    fill = "right"
  )
V1 V2 V3 V4 V5 V6 V7
1 00006303657102064942660780914135165036 12867 15476 15473 15474 15397 14050
2 00006319625527159782351492300309533775 12867 15473 13678 13497 15397 <NA>
3 00006327933867965144524703512179615086 12867 14245 15397 15473 15474 <NA>
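As for why the attempt in the question misbehaves: rbind() on vectors of unequal length recycles the shorter ones (with a warning) instead of padding them, so values from the start of a short row reappear in its trailing columns. A minimal base R sketch of one way around that, assuming the lines are already in a character vector (the names x, parts, n_max, padded and df2 are placeholders for illustration, not from the original post):

```r
# The three example rows from the question, as a character vector
x <- c(
  "00006303657102064942660780914135165036 12867 15476 15473 15474 15397 14050",
  "00006319625527159782351492300309533775 12867 15473 13678 13497 15397",
  "00006327933867965144524703512179615086 12867 14245 15397 15473 15474"
)

# Split on a literal space
parts <- strsplit(x, " ", fixed = TRUE)

# Longest row decides the number of columns
n_max <- max(lengths(parts))

# Pad every row to n_max entries; `length<-` fills the tail with NA
padded <- lapply(parts, function(p) {
  length(p) <- n_max
  p
})

# All rows now have equal length, so rbind() no longer recycles
df2 <- as.data.frame(do.call(rbind, padded), stringsAsFactors = FALSE)
names(df2) <- paste0("V", seq_len(n_max))
```

Everything stays character here, which keeps the leading zeros in V1; convert the other columns with as.integer() if you need numbers.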
Answer 1 (score: 1)
Good old plyr also works:
txt <- readLines(n = 3)
1 00006303657102064942660780914135165036 12867 15476 15473 15474 15397 14050
2 00006319625527159782351492300309533775 12867 15473 13678 13497 15397
3 00006327933867965144524703512179615086 12867 14245 15397 15473 15474
library(plyr)
rbind.fill(
  lapply(
    strsplit(txt, " "),
    function(y) {
      # via @Arun http://stackoverflow.com/questions/17308551/do-callrbind-list-for-uneven-number-of-column
      as.data.frame(t(y), stringsAsFactors = FALSE)
    }
  )
)
# V1 V2 V3 V4 V5 V6 V7 V8
# 1 1 00006303657102064942660780914135165036 12867 15476 15473 15474 15397 14050
# 2 2 00006319625527159782351492300309533775 12867 15473 13678 13497 15397 <NA>
# 3 3 00006327933867965144524703512179615086 12867 14245 15397 15473 15474 <NA>
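Since the question says the data lives in a file, it may also be possible to skip the splitting step and let read.table() handle the ragged rows directly. This is only a sketch under assumptions: the fields are whitespace-separated, the longest row occurs within the first five lines (read.table() uses those to decide the column count), and the names x and df3 are placeholders. For a real file, pass its path as the first argument instead of text =.

```r
# Sample rows from the question; with a real file, use read.table("path/to/file", ...)
x <- c(
  "00006303657102064942660780914135165036 12867 15476 15473 15474 15397 14050",
  "00006319625527159782351492300309533775 12867 15473 13678 13497 15397",
  "00006327933867965144524703512179615086 12867 14245 15397 15473 15474"
)

df3 <- read.table(
  text = x,
  header = FALSE,
  fill = TRUE,                # pad short rows with blank fields
  colClasses = "character"    # keep the 38-digit IDs and their leading zeros intact
)
```

With colClasses = "character" the padded cells come back as empty strings rather than NA, so recode them afterwards if you need missing values. Reading V1 as character matters because a 38-digit ID cannot be represented exactly as a number.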