Question

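So, let's say I want to find a pattern in a string, and if the pattern exists, I keep only the part of the string up to that pattern. My problem is that if the pattern does not exist, str_locate returns NA, so the final result is NA as well. I would like it to return the original string when the pattern is not present.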

library(stringr)
library(dplyr)
unique(iris$Species)
#> [1] setosa     versicolor virginica 
#> Levels: setosa versicolor virginica

test <- iris %>%
  mutate(Species = str_sub(Species, 1, str_locate(Species, "t")[,1] ))

head(test)
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> 1          5.1         3.5          1.4         0.2     set
#> 2          4.9         3.0          1.4         0.2     set
#> 3          4.7         3.2          1.3         0.2     set
#> 4          4.6         3.1          1.5         0.2     set
#> 5          5.0         3.6          1.4         0.2     set
#> 6          5.4         3.9          1.7         0.4     set
tail(test)
#>     Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> 145          6.7         3.3          5.7         2.5    <NA>
#> 146          6.7         3.0          5.2         2.3    <NA>
#> 147          6.3         2.5          5.0         1.9    <NA>
#> 148          6.5         3.0          5.2         2.0    <NA>
#> 149          6.2         3.4          5.4         2.3    <NA>
#> 150          5.9         3.0          5.1         1.8    <NA>

Created on 2019-07-14 by the reprex package (v0.3.0)

Answer 1

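We can use a regular expression with str_remove. If the pattern is not found, str_remove returns the original string unchanged. Here the lookbehind (?<=t) starts the match right after the first 't', and .* matches the rest of the string; if such a match is found, those characters are removed.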

library(dplyr)
library(stringr)
# Drop everything after the first "t"; strings without a "t" are left as they are
test <- iris %>%
  mutate(Species = str_remove(Species, "(?<=t).*"))
head(test)
#  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#1          5.1         3.5          1.4         0.2     set
#2          4.9         3.0          1.4         0.2     set
#3          4.7         3.2          1.3         0.2     set
#4          4.6         3.1          1.5         0.2     set
#5          5.0         3.6          1.4         0.2     set
#6          5.4         3.9          1.7         0.4     set
tail(test)
#    Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
#145          6.7         3.3          5.7         2.5 virginica
#146          6.7         3.0          5.2         2.3 virginica
#147          6.3         2.5          5.0         1.9 virginica
#148          6.5         3.0          5.2         2.0 virginica
#149          6.2         3.4          5.4         2.3 virginica
#150          5.9         3.0          5.1         1.8 virginica
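
To see the difference outside of mutate, here is a minimal sketch on a plain character vector. The names x, start, lookbehind and fallback are made up for this illustration and do not appear in the question. It shows where the NA comes from (str_locate returns NA when there is no match, and str_sub propagates it), and one possible way to keep the original str_locate approach by falling back to the full string length. Note that the stringr functions return character vectors, so a factor column such as Species ends up as character.

```r
library(stringr)

# Illustrative input: only "setosa" contains a "t"
x <- c("setosa", "versicolor", "virginica")

# str_locate() gives NA for strings without a match ...
start <- str_locate(x, "t")[, "start"]

# ... and str_sub() propagates that NA, which is the problem in the question
str_sub(x, 1, start)
#> [1] "set" NA    NA

# The lookbehind approach leaves non-matching strings untouched
lookbehind <- str_remove(x, "(?<=t).*")
lookbehind
#> [1] "set"        "versicolor" "virginica"

# Alternative sketch (an assumption, not part of the answer above):
# replace a missing position with the full string length before subsetting
end <- ifelse(is.na(start), str_length(x), start)
fallback <- str_sub(x, 1, end)
fallback
#> [1] "set"        "versicolor" "virginica"
```

Both variants give the same result here. The str_remove version is shorter, while the fallback version may be easier to adapt if the cut position needs further arithmetic.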
