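I want to convert a vector into a data frame. The vector consists of unique IDs, each followed by that record's other fields. The set of fields is exhaustive: there are about 30 distinct fields, each marked with a leading backslash.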
\ID a
\description text yes
\definition text yes
\other.info text yes
\ID b
\definition text yes
\other.info text yes
\ID d
\description text yes
\other.info text yes
\translation text yes
I need to convert it to:
ID  description  definition  other.info  translation
a   text yes     text yes    text yes
b                text yes    text yes
d   text yes                 text yes    text yes
Thanks for your help.
Answer 0 (score: 0)
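This is a bit dirty, but it gets the job done. Note that it keeps only the last word of each line as the value (yes rather than text yes):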
library(stringr) # Will use str_extract() with some regex
library(magrittr) # pipes: %>%
library(data.table) # rbindlist (I think dplyr has bind_rows() which is similar)
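# Start a new group at every line beginning with \ID, then build a one-row data frame per record
split(vect, cumsum(grepl("^\\\\ID", vect))) %>%
  lapply(function(x) setNames(data.frame(t(str_extract(x, "\\w+$"))), str_extract(x, "^\\S+"))) %>%
  rbindlist(fill = TRUE) %>%              # stack the records; fields a record lacks become NA
  setNames(sub("^\\\\", "", names(.)))    # drop the leading backslash from the column names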
   ID description definition other.info translation
1:  a         yes        yes        yes        <NA>
2:  b        <NA>        yes        yes        <NA>
3:  d         yes       <NA>        yes         yes
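The grouping step drives the whole pipeline, so it may help to look at it on its own. The sketch below uses base R only and a shortened, made-up vector `v` (hypothetical, not the original data). It assumes every record starts with a line beginning with `\ID`, which is why the pattern is anchored to the start of the line.

```r
# Hypothetical shortened input, for illustration only
v <- c("\\ID a", "\\description text yes", "\\ID b", "\\definition text yes")

is_id  <- grepl("^\\\\ID", v)  # TRUE on lines that start a new record
record <- cumsum(is_id)        # running record number: 1 1 2 2
groups <- split(v, record)     # list with one character vector per record

str(groups)
```

Each element of `groups` holds the lines of one record, which is what `lapply()` then turns into a one-row data frame.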
Data:
vect <- c("\\ID a", "\\description text yes", "\\definition text yes", "\\other.info text yes",
"\\ID b", "\\definition text yes", "\\other.info text yes", "\\ID d",
"\\description text yes", "\\other.info text yes", "\\translation text yes"
)
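If the full value (`text yes`) is wanted, as in the question's desired output, one possible variant in base R is sketched below. It is an alternative technique, not the pipeline above: it assumes each line has the form `\field value`, that records start at an `\ID` line, and that a field occurs at most once per record. The names `field`, `value`, `record`, `wide` and `result` are chosen here for illustration.

```r
# Same data as in the answer
vect <- c("\\ID a", "\\description text yes", "\\definition text yes", "\\other.info text yes",
          "\\ID b", "\\definition text yes", "\\other.info text yes", "\\ID d",
          "\\description text yes", "\\other.info text yes", "\\translation text yes")

# Field name: the text between the leading backslash and the first whitespace
field <- sub("^\\\\(\\S+).*$", "\\1", vect)
# Value: everything after the first run of whitespace
value <- sub("^\\S+\\s+", "", vect)
# Record number: goes up by one at every \ID line
record <- cumsum(field == "ID")

# Pre-fill a character matrix with NA, then place each value at (record, field)
fields <- unique(field)
wide <- matrix(NA_character_, nrow = max(record), ncol = length(fields),
               dimnames = list(NULL, fields))
wide[cbind(record, match(field, fields))] <- value

result <- as.data.frame(wide, stringsAsFactors = FALSE)
print(result)
```

Columns appear in order of first occurrence in `vect`, and fields missing from a record stay `NA`. If a field were repeated within one record, the later value would silently overwrite the earlier one, so that case would need separate handling.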