I have a data frame with the following sample values.
[1] "entry.cei"
[2] "entry.lifecycle->hist.open.personal demand chequing account->exit.lifecycle->entry.cei"
[3] "entry.lifecycle->hist.open.personal demand savings account->exit.lifecycle->entry.cei"
[4] "entry.transaction->txn.no source available->exit.transaction->entry.cei"
[5] "entry.branch->exit.branch->entry.transaction->txn.in-branch->exit.transaction->entry.cei"
I need to split them on "->" and put the pieces in separate columns, such as V1, V2, and so on. For example:
V1 V2 V3 V4 V5 V6 V7
1 entry.cei
2 entry.lifecycle hist.open.personal demand chequing account exit.lifecycle entry.cei
3 entry.lifecycle hist.open.personal demand savings account exit.lifecycle entry.cei
How can I achieve this in R? I tried using rbind together with strsplit(), but I think it requires every row to have the same number of columns.
Answer 0 (score: 1)
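The simplest approach is to use gsub to replace each -> with a comma and then read the result with read.csv. If your data already contains commas, replace -> with > instead and pass sep = ">" to read.csv; that should work fine.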
read.csv(text = gsub("->", ",", x, fixed = TRUE), header = FALSE)
# V1 V2 V3 V4 V5 V6
# 1 entry.cei
# 2 entry.lifecycle hist.open.personal demand chequing account exit.lifecycle entry.cei
# 3 entry.lifecycle hist.open.personal demand savings account exit.lifecycle entry.cei
# 4 entry.transaction txn.no source available exit.transaction entry.cei
# 5 entry.branch exit.branch entry.transaction txn.in-branch exit.transaction entry.cei
Alternatively:
read.table(text = gsub("->", ",", x, fixed = TRUE), sep = ",", fill = TRUE)
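For the case where fields contain commas, the ">" variant mentioned above might look like the following. This is only a sketch: the sample vector with an embedded comma is made up for illustration, and it assumes ">" never occurs on its own in the data. Because read.table infers the column count from the first five lines of input, the sketch also passes col.names explicitly so that a longer row further down is not cut off.

```r
# Made-up sample; the second element has a comma inside one field.
x <- c("entry.cei",
       "entry.lifecycle->hist.open, personal account->exit.lifecycle->entry.cei")

# Width of the widest row, so read.table does not have to guess it.
n <- max(lengths(strsplit(x, "->", fixed = TRUE)))

res <- read.table(
  text = gsub("->", ">", x, fixed = TRUE),  # assumes ">" is not in the data itself
  sep = ">",
  fill = TRUE,                              # pad rows that have fewer fields
  header = FALSE,
  col.names = paste0("V", seq_len(n)),      # fixes the number of columns
  quote = "",                               # treat quote characters as plain text
  comment.char = "",                        # treat "#" as plain text
  stringsAsFactors = FALSE
)
res
```

Turning off quote and comment.char is a defensive choice here, since free-text fields could otherwise be misread if they contain apostrophes or "#".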
You can still use rbind and strsplit, as long as you first make all the list elements the same length. The length<- replacement function can help with that.
s <- strsplit(x, "->", fixed = TRUE)
data.frame(do.call(rbind, lapply(s, `length<-`, max(sapply(s, length)))))
# X1 X2 X3 X4 X5 X6
# 1 entry.cei <NA> <NA> <NA> <NA> <NA>
# 2 entry.lifecycle hist.open.personal demand chequing account exit.lifecycle entry.cei <NA> <NA>
# 3 entry.lifecycle hist.open.personal demand savings account exit.lifecycle entry.cei <NA> <NA>
# 4 entry.transaction txn.no source available exit.transaction entry.cei <NA> <NA>
# 5 entry.branch exit.branch entry.transaction txn.in-branch exit.transaction entry.cei
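The padding approach above can be wrapped in a small helper if it is needed in more than one place. The function name split_to_columns is hypothetical (it is not part of base R or of the answer), and the sketch assumes a non-empty input; it converts the input with as.character first, since a data frame column may be a factor.

```r
# Hypothetical helper: split each string on a fixed separator and pad
# shorter rows with NA so that everything fits into one data frame.
split_to_columns <- function(x, sep = "->") {
  parts <- strsplit(as.character(x), sep, fixed = TRUE)
  n <- max(lengths(parts))                  # width of the widest row
  padded <- lapply(parts, `length<-`, n)    # extend each vector with NA
  out <- as.data.frame(do.call(rbind, padded), stringsAsFactors = FALSE)
  names(out) <- paste0("V", seq_len(n))     # V1, V2, ... as in the question
  out
}

x <- c("entry.cei",
       "entry.lifecycle->hist.open.personal demand chequing account->exit.lifecycle->entry.cei",
       "entry.branch->exit.branch->entry.transaction->txn.in-branch->exit.transaction->entry.cei")

res <- split_to_columns(x)
res
```

Unlike the read.csv version, the missing cells here are NA rather than empty strings, which may or may not be what you want downstream.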
The original x vector is
x <- c("entry.cei",
"entry.lifecycle->hist.open.personal demand chequing account->exit.lifecycle->entry.cei",
"entry.lifecycle->hist.open.personal demand savings account->exit.lifecycle->entry.cei",
"entry.transaction->txn.no source available->exit.transaction->entry.cei",
"entry.branch->exit.branch->entry.transaction->txn.in-branch->exit.transaction->entry.cei")