String problem: removing the patterns "//", "///", "////" at the end of a string

Time: 2019-07-14 14:46:44

Tags: r stringr

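I have a column whose values may end with "//". It could also be a single "/", or three or more. How can I remove the slashes at the end of each string?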


df <- structure(list(hej = c("UXCG40///", "UXCD00///", "UXCD00///", 
                             "UXCC77///", "UXCC77///", "UXCA00///", "UXCD00///", "UXCC00/UXCD00//", 
                             "UXCD00///", "UXCC00/UXCD00//", "UXCA00///", "UXCC00///", "UXCG40///", 
                             "UXCC00/UXCD00//", "UXCE30///", "UXCD00///", "UXCD00///", "UXCC00///", 
                             "UXCA00///")), row.names = c(NA, -19L),
                class = c("tbl_df", "tbl", "data.frame"))



print(df$hej[5:19])
#>  [1] "UXCC77///"       "UXCA00///"       "UXCD00///"      
#>  [4] "UXCC00/UXCD00//" "UXCD00///"       "UXCC00/UXCD00//"
#>  [7] "UXCA00///"       "UXCC00///"       "UXCG40///"      
#> [10] "UXCC00/UXCD00//" "UXCE30///"       "UXCD00///"      
#> [13] "UXCD00///"       "UXCC00///"       "UXCA00///"

2 answers:

Answer 0 (score: 3)

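Just specify +, which means one or more of the preceding character, in this case /. Also anchor the pattern with $ (the metacharacter for the end of the string), so that a / at any other position is not matched.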

library(dplyr)
library(stringr)
df1 <- df %>%
  mutate(hej = str_remove(hej, "/+$"))

df1
# A tibble: 19 x 1
#   hej          
#   <chr>        
# 1 UXCG40       
# 2 UXCD00       
# 3 UXCD00       
# 4 UXCC77       
# 5 UXCC77       
# 6 UXCA00       
# 7 UXCD00       
# 8 UXCC00/UXCD00
# 9 UXCD00       
#10 UXCC00/UXCD00
#11 UXCA00       
#12 UXCC00       
#13 UXCG40       
#14 UXCC00/UXCD00
#15 UXCE30       
#16 UXCD00       
#17 UXCD00       
#18 UXCC00       
#19 UXCA00       

In base R, the equivalent is sub:

df$hej <- sub("/+$", "", df$hej)
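To see why the $ anchor matters, here is a minimal base R sketch comparing the anchored pattern with an unanchored one. The vector x and the variable names are made up for illustration and are not part of the original data frame.

```r
# Hypothetical sample values, similar in shape to the hej column
x <- c("UXCC00/UXCD00//", "UXCG40///", "UXCG40")

# Anchored with $: only the run of slashes touching the end is removed
anchored <- sub("/+$", "", x)

# No anchor: sub() removes the first run of slashes it finds,
# which for the first element is the separator in the middle
unanchored <- sub("/+", "", x)

# No anchor with gsub(): every slash is removed, including the separator
all_removed <- gsub("/+", "", x)

print(anchored)
print(unanchored)
print(all_removed)
```

Only the anchored version keeps the inner separator in "UXCC00/UXCD00" while dropping the trailing slashes.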

Answer 1 (score: 2)

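Besides sub, the dedicated function trimws (which calls sub internally) can also be used for this. Note that the whitespace argument requires R 3.6.0 or later: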

trimws(c("UXCG40", "UXCG40/", "UXCG40///", "UXCC00/UXCD00//"), which = "right", whitespace = "/")
#> [1] "UXCG40"        "UXCG40"        "UXCG40"        "UXCC00/UXCD00"
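Since trimws treats its whitespace argument as a regular expression, the call above should give the same result as sub with "/+$". The following is a small sketch of that equivalence, assuming R 3.6.0 or later; the sample vector and variable names are illustrative only.

```r
# Hypothetical sample values
x <- c("UXCG40", "UXCG40/", "UXCG40///", "UXCC00/UXCD00//")

via_trimws <- trimws(x, which = "right", whitespace = "/")
via_sub <- sub("/+$", "", x)

# Both approaches should agree on this input
print(identical(via_trimws, via_sub))

# whitespace is a regex, so a metacharacter such as "." needs escaping,
# for example with a character class, to be matched literally
dots <- trimws("v1.2...", which = "right", whitespace = "[.]")
print(dots)
```

For a character other than /, it is worth checking whether it has a special meaning in regular expressions before passing it as whitespace.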