Question

I have a data frame with the following content:

Column 1

London.(Sessions)
Birmingham.(Sessions)
Leeds.(Sessions)

How can I strip the unwanted part of each string so that I end up with this?

Column 1

London
Birmingham
Leeds

So far I have used the following code:

stacked_sessions<-stacked_sessions%>%
 mutate_all(~gsub("(Sessions)", "", .))%>%
 mutate_all(funs(str_replace_all(.,'[\\.,]','')))

but the output I get is:

London()
Birmingham()
Leeds()

Answer 1

Do you want to remove everything from the "." onwards?

df$Column1 <- sub('\\..*', '', df$Column1)
df

#     Column1
#1     London
#2 Birmingham
#3      Leeds

The stringr equivalent uses str_remove:

df$Column1 <- stringr::str_remove(df$Column1, "\\..*")
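As for why the attempt in the question leaves "()" behind: in a regular expression, unescaped parentheses form a capture group, so the pattern "(Sessions)" matches only the word Sessions and the literal brackets survive. A minimal sketch of the difference, using a plain character vector `x` (a name introduced here for illustration) instead of the data frame:

```r
x <- c("London.(Sessions)", "Birmingham.(Sessions)", "Leeds.(Sessions)")

# Unescaped: "(Sessions)" is a capture group, so only the word "Sessions" is removed
unescaped <- gsub("(Sessions)", "", x)
# the dot and the now-empty brackets are still there, e.g. "London.()"

# fixed = TRUE treats the pattern as a literal string, dot and brackets included
literal <- gsub(".(Sessions)", "", x, fixed = TRUE)

# The same thing as a regex, with each metacharacter escaped
escaped <- gsub("\\.\\(Sessions\\)", "", x)
```

Both `literal` and `escaped` contain just the city names. The literal form only removes the exact suffix ".(Sessions)", whereas '\\..*' removes whatever follows the first dot.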

Data

df <-  structure(list(Column1 = c("London.(Sessions)", "Birmingham.(Sessions)", 
"Leeds.(Sessions)")), class = "data.frame", row.names = c(NA, -3L))
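The question applies the replacement to every column with mutate_all. If that is really what is needed, a base R sketch along the following lines should work. The two-column data frame `df2` is hypothetical and only there to show the effect on more than one column:

```r
# Hypothetical two-column data frame, for illustration only
df2 <- data.frame(
  city   = c("London.(Sessions)", "Leeds.(Sessions)"),
  region = c("South.(Sessions)", "North.(Sessions)"),
  stringsAsFactors = FALSE
)

# Assigning into df2[] replaces every column but keeps the data.frame structure
df2[] <- lapply(df2, function(col) sub("\\..*", "", col))
```

Note that sub() coerces its input to character, so this would also truncate numeric columns at the decimal point. Restrict it to the character columns if the data frame is mixed.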

Answer 2

We can use trimws from base R (the whitespace argument requires R 3.6.0 or later)

df$Column1 <-  trimws(df$Column1, whitespace = '\\..*')
df$Column1
#[1] "London"     "Birmingham" "Leeds"

Or use regmatches/regexpr from base R

regmatches(df$Column1, regexpr("^[^.]+", df$Column1))
#[1] "London"     "Birmingham" "Leeds"

Or use str_extract from stringr

library(stringr)
str_extract(df$Column1, "^\\w+")  
#[1] "London"     "Birmingham" "Leeds"
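The two extraction patterns above agree on the sample data but are not interchangeable. "^[^.]+" keeps everything before the first dot, while "^\\w+" stops at the first non-word character, such as a space or hyphen. A small sketch with made-up place names (not from the question), using base R for both patterns so that no package is needed:

```r
# Hypothetical inputs, not from the question
x <- c("Milton Keynes.(Sessions)", "Stoke-on-Trent.(Sessions)")

# Everything up to, but not including, the first "."
up_to_dot <- regmatches(x, regexpr("^[^.]+", x))
# "Milton Keynes" "Stoke-on-Trent"

# Only the leading run of word characters (letters, digits, underscore)
word_only <- regmatches(x, regexpr("^\\w+", x))
# "Milton" "Stoke"
```

If the names may contain spaces or punctuation, the "^[^.]+" form is probably the safer choice.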

Data

df <-  structure(list(Column1 = c("London.(Sessions)", "Birmingham.(Sessions)", 
"Leeds.(Sessions)")), class = "data.frame", row.names = c(NA, -3L))