This question is related to: Convert long state names embedded with other text to two-letter state abbreviations
The following for loop works well.
for (r in 1:nrow(states.list)) {
  states = sub(states.list[r,1], states.list[r,2], states)
}
states
[1] "Plano NJ" "NC" "xyz" "AL 02138" "TX" "Town IA 99999"
Data:
states <- c("Plano New Jersey", "NC", "xyz", "Alabama 02138", "Texas", "Town Iowa 99999")
states.list = structure(list(state.name = structure(c(4L, 1L, 5L, 2L, 3L), .Label = c("Alabama",
"Iowa", "Minnesota", "New Jersey", "Texas"), class = "factor"),
state.abb = structure(c(4L, 1L, 5L, 2L, 3L), .Label = c("AL",
"IA", "MN", "NJ", "TX"), class = "factor")), .Names = c("state.name",
"state.abb"), class = "data.frame", row.names = c(NA, -5L))
states.list
  state.name state.abb
1 New Jersey        NJ
2    Alabama        AL
3      Texas        TX
4       Iowa        IA
5  Minnesota        MN
I tried vectorized solutions, but none of them work:
apply(states.list, 1, function(x) {
  sapply(states, function(y) {
    sub(x[1], x[2], y)
  })
})
sapply(states, function(x) sub(states.list[,1], states.list[,2], x))
apply(states.list, 1, function(x) sub(x[1],x[2], states))
How can I convert this into a vectorized solution (using apply etc., without any special packages)? Thanks for your help.
Edit: output of akrun's solution:
sapply ( seq_len(nrow(states.list)), function(i) {
+ sub(states.list[i,1], states.list[i,2], states[i])
+ })
[1] "Plano NJ" "NC" "xyz" "Alabama 02138" "Texas"
Answer 0 (score: 2)
I doubt this can be vectorized. At most you can hide the for loop under an *apply equivalent, or use Reduce, for example:
ARGS <- split(states.list, seq_len(nrow(states.list)))
FUN <- function(x, y) gsub(as.character(y$state.name),
as.character(y$state.abb), x)
Reduce(FUN, ARGS, states)
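This is fancy, but IMHO it is not worth the effort: it is probably no faster than the for loop, and it is harder to understand, isn't it? There is a bit too much stigma attached to using for in R.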
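As a rough check of the claim that Reduce only hides the loop, here is a minimal sketch that runs both on the question's data and compares the results. It assumes the lookup table is rebuilt with plain character columns (stringsAsFactors = FALSE) instead of factors, and it folds over row indices rather than over split(); the names loop_result and reduce_result are made up for illustration. Each substitution has to see the output of the previous one, because sub() uses only the first element of a pattern vector, which is why the one-shot sapply/apply attempts in the question fail.

```r
# Same data as in the question, with character columns instead of factors
# (an assumption made here to avoid the as.character() calls)
states <- c("Plano New Jersey", "NC", "xyz", "Alabama 02138", "Texas", "Town Iowa 99999")
states.list <- data.frame(
  state.name = c("New Jersey", "Alabama", "Texas", "Iowa", "Minnesota"),
  state.abb  = c("NJ", "AL", "TX", "IA", "MN"),
  stringsAsFactors = FALSE
)

# Baseline: the for loop from the question, applied to a copy of states
loop_result <- states
for (r in seq_len(nrow(states.list))) {
  loop_result <- sub(states.list[r, 1], states.list[r, 2], loop_result)
}

# Reduce as a left fold over row indices: x is the running character vector,
# i is the row of the lookup table applied at this step
reduce_result <- Reduce(
  function(x, i) sub(states.list$state.name[i], states.list$state.abb[i], x),
  seq_len(nrow(states.list)),
  states
)

print(reduce_result)
print(identical(loop_result, reduce_result))
```

Both versions perform one full pass over states per row of states.list, so Reduce changes the notation rather than the amount of work.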