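I have the output of a matching function. In some cases the function cannot choose one of two or more candidate names from the match, so it stores all of them as a vector in the column.
What I want to accomplish is to select the first, second, third, ... element of the vector in that column to continue with.
Here is a reproducible data frame: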
string <- c("c(\"Kaskazini 'A'\", \"Kaskazini 'B'\")",
            "c(\"Kabale\", \"Kabare\")",
            "c(\"Kisoko\", \"Kisoro Tc\")",
            "c(\"Luwero East\", \"Luwero West\")",
            "c(\"Marindi\", \"Malindi\")",
            "c(\"Mukongoro\", \"Mukono Tc\", \"Muko\")")
testdf <- data.frame(string)
Answer 0 (score: 1)
Here is a simple approach using regular expressions:
# extract instances (in a list)
strings <- regmatches(testdf$string,
                      gregexpr("(?<=\")[^\"]+?(?=\"[,)])",
                               testdf$string, perl = TRUE))
# print the extracted list
strings
[[1]]
[1] "Kaskazini 'A'" "Kaskazini 'B'"
[[2]]
[1] "Kabale" "Kabare"
[[3]]
[1] "Kisoko" "Kisoro Tc"
[[4]]
[1] "Luwero East" "Luwero West"
[[5]]
[1] "Marindi" "Malindi"
[[6]]
[1] "Mukongoro" "Mukono Tc" "Muko"
# add columns to `testdf`
testdf$first <- sapply(strings, "[", 1)
testdf$second <- sapply(strings, "[", 2)
testdf$third <- sapply(strings, "[", 3)
                               string         first        second third
1 c("Kaskazini 'A'", "Kaskazini 'B'") Kaskazini 'A' Kaskazini 'B'  <NA>
2               c("Kabale", "Kabare")        Kabale        Kabare  <NA>
3            c("Kisoko", "Kisoro Tc")        Kisoko     Kisoro Tc  <NA>
4     c("Luwero East", "Luwero West")   Luwero East   Luwero West  <NA>
5             c("Marindi", "Malindi")       Marindi       Malindi  <NA>
6 c("Mukongoro", "Mukono Tc", "Muko")     Mukongoro     Mukono Tc  Muko
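If you don't want to create all the columns by hand, or you don't know the maximum number of instances, you can use the following approach: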
# one column per position, up to the length of the longest vector
res <- sapply(seq_len(max(sapply(strings, length))), function(x)
  sapply(strings, "[", x))
cbind(testdf, res)
                               string             1             2    3
1 c("Kaskazini 'A'", "Kaskazini 'B'") Kaskazini 'A' Kaskazini 'B' <NA>
2               c("Kabale", "Kabare")        Kabale        Kabare <NA>
3            c("Kisoko", "Kisoro Tc")        Kisoko     Kisoro Tc <NA>
4     c("Luwero East", "Luwero West")   Luwero East   Luwero West <NA>
5             c("Marindi", "Malindi")       Marindi       Malindi <NA>
6 c("Mukongoro", "Mukono Tc", "Muko")     Mukongoro     Mukono Tc Muko
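The columns that `cbind` produces here are named `1`, `2`, `3`, which are awkward to refer to later. Below is a minimal sketch of one way to get descriptive names instead. It assumes a list shaped like `strings` from the regex step; the small stand-in list, the `name_` prefix, and the variable names `n_max`, `res`, and `res_df` are illustrative only and not part of the original answer.

```r
# Stand-in for the `strings` list built by regmatches() (illustrative data)
strings <- list(c("Kabale", "Kabare"),
                c("Mukongoro", "Mukono Tc", "Muko"))

# The longest vector determines how many columns are needed
n_max <- max(lengths(strings))

# Indexing with `[` beyond a vector's length yields NA, so shorter
# vectors are padded; rbind() stacks them into one row per entry
res <- do.call(rbind, lapply(strings, function(x) x[seq_len(n_max)]))
colnames(res) <- paste0("name_", seq_len(n_max))

res_df <- as.data.frame(res, stringsAsFactors = FALSE)
```

The resulting `res_df` could then be attached with `cbind(testdf, res_df)` in the same way as the unnamed matrix.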
Answer 1 (score: 0)
I think this is what you want.
string <- c("c(\"Kaskazini 'A'\", \"Kaskazini 'B'\")",
            "c(\"Kabale\", \"Kabare\")",
            "c(\"Kisoko\", \"Kisoro Tc\")",
            "c(\"Luwero East\", \"Luwero West\")",
            "c(\"Marindi\", \"Malindi\")",
            "c(\"Mukongoro\", \"Mukono Tc\", \"Muko\")")
testdf <- data.frame(string)
# convert all double quotes into pipe symbols for use as a delimiter
testdf$string <- gsub('"', "|", testdf$string)
# split the string on the pipe
testdf$strsplit <- strsplit(testdf$string, "|", fixed = TRUE)
# extract the first name (element 2 of each split) using sapply
testdf$first <- sapply(testdf$strsplit, function(x) x[[2]])
# extract the second name (element 4 of each split) using sapply
testdf$second <- sapply(testdf$strsplit, function(x) x[[4]])
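Extending this to a third name needs some care: after the split, the names sit at positions 2, 4, 6, and `x[[6]]` raises a "subscript out of bounds" error for rows that hold only two names. A minimal sketch of one way around that, using single-bracket indexing, which returns `NA` for a missing position. The stand-in list `parts` mimics what `strsplit` produces and, like the name `third`, is illustrative only.

```r
# Stand-in for two rows of the strsplit() result (illustrative data)
parts <- list(c("c(", "Kabale", ", ", "Kabare", ")"),
              c("c(", "Mukongoro", ", ", "Mukono Tc", ", ", "Muko", ")"))

# `[` returns NA when position 6 does not exist, whereas `[[` would error
third <- vapply(parts, function(x) x[6], character(1))
```

With the data frame from this answer, the same call would be applied to `testdf$strsplit` in place of `parts`.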