I have a column whose values may end in "//". It could be a single "/", or three or more. How do I remove the slashes at the end of each string?
df <- structure(list(hej = c("UXCG40///", "UXCD00///", "UXCD00///",
"UXCC77///", "UXCC77///", "UXCA00///", "UXCD00///", "UXCC00/UXCD00//",
"UXCD00///", "UXCC00/UXCD00//", "UXCA00///", "UXCC00///", "UXCG40///",
"UXCC00/UXCD00//", "UXCE30///", "UXCD00///", "UXCD00///", "UXCC00///",
"UXCA00///")), row.names = c(NA, -19L), class = c("tbl_df", "tbl",
"data.frame"))
print(df$hej[5:19])
#> [1] "UXCC77///" "UXCA00///" "UXCD00///"
#> [4] "UXCC00/UXCD00//" "UXCD00///" "UXCC00/UXCD00//"
#> [7] "UXCA00///" "UXCC00///" "UXCG40///"
#> [10] "UXCC00/UXCD00//" "UXCE30///" "UXCD00///"
#> [13] "UXCD00///" "UXCC00///" "UXCA00///"
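Answer 0 (score: 3)
Just specify +, which matches one or more occurrences of the preceding character (here, /), and anchor the pattern with $ (the metacharacter for the end of the string). That way it will not match a / at any other position.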
library(dplyr)
library(stringr)
# Remove the run of one or more "/" at the end of each string
df1 <- df %>%
  mutate(hej = str_remove(hej, "/+$"))
df1
# A tibble: 19 x 1
# hej
# <chr>
# 1 UXCG40
# 2 UXCD00
# 3 UXCD00
# 4 UXCC77
# 5 UXCC77
# 6 UXCA00
# 7 UXCD00
# 8 UXCC00/UXCD00
# 9 UXCD00
#10 UXCC00/UXCD00
#11 UXCA00
#12 UXCC00
#13 UXCG40
#14 UXCC00/UXCD00
#15 UXCE30
#16 UXCD00
#17 UXCD00
#18 UXCC00
#19 UXCA00
In base R, this would be sub:
df$hej <- sub("/+$", "", df$hej)
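To illustrate why the pattern needs both the quantifier and the anchor, here is a minimal sketch on a small hypothetical vector x (made up for this example, not taken from df):

```r
# Hypothetical sample values, for illustration only
x <- c("UXCG40///", "UXCC00/UXCD00//", "UXCG40")

# Quantifier and anchor: removes the whole run of trailing slashes
trailing_run <- sub("/+$", "", x)

# Anchor only: removes just the final slash
last_only <- sub("/$", "", x)

# No anchor: removes every slash, including the one in the middle
everywhere <- gsub("/", "", x)

print(trailing_run)
print(last_only)
print(everywhere)
```

Only the first call removes every trailing slash while leaving the interior slash in "UXCC00/UXCD00" intact.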
Answer 1 (score: 2)
Besides sub, the dedicated function trimws (which calls sub internally) can also be used for this:
trimws(c("UXCG40", "UXCG40/", "UXCG40///", "UXCC00/UXCD00//"), which = "right", whitespace = "/")
#> [1] "UXCG40" "UXCG40" "UXCG40" "UXCC00/UXCD00"
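As a rough check that the two approaches agree, the sketch below runs both on a few of the values from the question. It assumes R 3.6.0 or later, where trimws accepts the whitespace argument. The name hej_sample and the "v1.0..." string are introduced here only for illustration.

```r
# A few values copied from the question's column (hej_sample is a made-up name)
hej_sample <- c("UXCG40///", "UXCC00/UXCD00//", "UXCA00///")

# Strip trailing slashes with trimws and with sub
via_trimws <- trimws(hej_sample, which = "right", whitespace = "/")
via_sub <- sub("/+$", "", hej_sample)

# Both approaches should give the same result
same_result <- identical(via_trimws, via_sub)
print(same_result)

# whitespace is interpreted as a regular expression, so a regex
# metacharacter such as "." has to be escaped or put in a character class
trimmed_dots <- trimws("v1.0...", which = "right", whitespace = "[.]")
print(trimmed_dots)
```

Because the whitespace argument is a regular expression rather than a literal string, trimming a character like . or * needs the same escaping it would need in sub.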