Question

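I have a vector of strings, and I want to replace a substring that is common to all of them with a different substring for each string. I am doing this in R. For example: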

input=c("I like fruits","I like you","I like dudes")
# I need to do something like this
newStrings=c("You","We","She")
gsub("I",newStrings,input)

so that the output looks like:

"You like fruits"
"We like you"
"She like dudes"

However, gsub only uses the first string in newStrings. Any suggestions? Thanks.

Answer 1

You can use stringr:

stringr::str_replace_all(input, "I" ,newStrings)

[1] "You like fruits" "We like you"    
[3] "She like dudes"

Or, as @David Arenburg suggested:

stringi::stri_replace_all_fixed(input, "I", newStrings)

BENCHMARK

library(stringi)
library(stringr)
library(microbenchmark)

set.seed(123)
x <- stri_rand_strings(1e3, 10)
y <- stri_rand_strings(1e3, 1)

identical(stringi::stri_replace_all_fixed(x, "I", y), stringr::str_replace_all(x, fixed("I") , y))
# [1] TRUE
identical(stringi::stri_replace_all_fixed(x, "I", y), diag(sapply(y, gsub, pattern = "I", x = x, fixed = TRUE)))
# [1] TRUE
identical(stringi::stri_replace_all_fixed(x, "I", y), mapply(gsub, "I", y, x, USE.NAMES = FALSE, fixed = TRUE))
# [1] TRUE

microbenchmark("stringi: " = stringi::stri_replace_all_fixed(x, "I", y),
               "stringr (optimized): " = stringr::str_replace_all(x, fixed("I") , y),
               "base::mapply (optimized): " = mapply(gsub, "I", y, x, USE.NAMES = FALSE, fixed = TRUE),
               "base::sapply (optimized): " = diag(sapply(y, gsub, pattern = "I", x = x, fixed = TRUE)))

# Unit: microseconds
#                       expr        min          lq        mean      median          uq        max neval cld
#                  stringi:     132.156    137.1165    171.5822    150.3960    194.2345    460.145   100  a 
#      stringr (optimized):     801.894    828.7730    947.1813    912.6095    968.7680   2716.708   100  a 
# base::mapply (optimized):    2827.104   2946.9400   3211.9614   3031.7375   3123.8940   8216.360   100  a 
# base::sapply (optimized):  402349.424 476545.9245 491665.8576 483410.3290 513184.3490 549489.667   100   b

Answer 2

In cases like this, mapply() is very useful:

mapply(sub, "I", newStrings, input, USE.NAMES = FALSE, fixed = TRUE)
# [1] "You like fruits" "We like you"     "She like dudes"
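
As a minimal sketch of why this works, assuming the input and newStrings vectors from the question: gsub() and sub() are vectorized over x only, so a replacement vector longer than one element is cut down to its first element (with a warning). mapply() instead calls sub() once per position, pairing newStrings[i] with input[i]. Passing the constant arguments through MoreArgs is a stylistic choice made here for illustration, not something the answer above uses.

```r
input <- c("I like fruits", "I like you", "I like dudes")
newStrings <- c("You", "We", "She")

# gsub() is vectorized over x only: it uses newStrings[1] for every element
# and emits a warning, which is suppressed here to keep the output clean.
first_only <- suppressWarnings(gsub("I", newStrings, input, fixed = TRUE))

# mapply() pairs newStrings[i] with input[i]; arguments that stay the same
# for every call (pattern, fixed) are supplied once via MoreArgs.
paired <- mapply(sub, replacement = newStrings, x = input,
                 MoreArgs = list(pattern = "I", fixed = TRUE),
                 USE.NAMES = FALSE)

print(first_only)
print(paired)
```

Note that sub() replaces only the first match in each string, while gsub() replaces every match; for the example strings the two give the same result because each contains a single "I".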

Answer 3

You can use sapply for this:

diag(sapply(newStrings,gsub,pattern="I",x=input))
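
A hedged note on the mechanics, again assuming the vectors from the question: sapply() calls gsub() once per replacement over the whole input, which yields a length(input) by length(newStrings) matrix, and diag() then keeps only the entries where the row and column index match. That means n^2 replacements are computed for n useful results, which is consistent with this approach being the slowest in the benchmark in Answer 1. The index-wise vapply() variant below is an illustrative alternative, not part of the original answers.

```r
input <- c("I like fruits", "I like you", "I like dudes")
newStrings <- c("You", "We", "She")

# Full matrix: column j holds every input string with "I" replaced
# by newStrings[j].
m <- sapply(newStrings, gsub, pattern = "I", x = input)
print(dim(m))

# diag() keeps m[i, i], i.e. input[i] combined with newStrings[i].
via_diag <- diag(m)

# Index-wise alternative: one gsub() call per element, so no n-by-n
# matrix is built.
via_index <- vapply(seq_along(input),
                    function(i) gsub("I", newStrings[i], input[i], fixed = TRUE),
                    character(1))

print(via_diag)
print(via_index)
```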

How do I replace one substring with different substrings in R?