I have a character vector
words <- c("somethingspan.", "..span?", "spanthank", "great to hear", "yourspan")
I am trying to remove span AND the punctuation from every word in the vector, to end up with:
> something thank great to hear your
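The problem is that there is no rule for whether span appears before or after the word I am interested in. In addition, span can be glued to: i) characters only (e.g. yourspan), ii) punctuation only (e.g. ..span?), or iii) characters and punctuation (e.g. somethingspan.).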
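I searched SO for answers, but the questions I found usually ask either to remove a whole word or to remove the part of a string after/before a letter or punctuation mark.
Any help would be much appreciated.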
Answer 0 (score: 2)
You can use
[[:punct:]]*span[[:punct:]]*
See the regex demo.
Details
[[:punct:]]* - 0+ punctuation characters
span - a literal substring
[[:punct:]]* - 0+ punctuation characters

words <- c("somethingspan.", "..span?", "spanthank", "great to hear", "yourspan")
words <- gsub("[[:punct:]]*span[[:punct:]]*", "", words) # Remove spans
words <- words[words != ""] # Discard empty elements
paste(words, collapse=" ") # Concat the elements
## => [1] "something thank great to hear your"
If removing the unwanted strings can leave elements that contain only whitespace, replace the second step with words <- words[trimws(words) != ""] (instead of words[words != ""]).
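The three steps above can also be wrapped in a small helper. This is only a sketch: the function name remove_token and its token argument are made up for illustration and are not part of the answer, and token is assumed to contain no regex metacharacters.

```r
# Hypothetical helper (not from the original answer): strips `token` plus any
# punctuation glued to it, drops elements that end up blank, and joins the rest.
# Assumes `token` contains no regex metacharacters.
remove_token <- function(x, token = "span") {
  pattern <- paste0("[[:punct:]]*", token, "[[:punct:]]*")
  x <- gsub(pattern, "", x)      # remove the token and adjacent punctuation
  x <- x[trimws(x) != ""]        # discard empty or whitespace-only elements
  paste(x, collapse = " ")       # concatenate into one sentence
}

words <- c("somethingspan.", "..span?", "spanthank", "great to hear", "yourspan")
result <- remove_token(words)
print(result)
## => [1] "something thank great to hear your"
```

Using trimws in the filter covers both the empty-string and the whitespace-only case, so the helper does not need two variants of the second step.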
Answer 1 (score: 1)
You can try everything out at https://regex101.com/.
clean_words <- gsub(pattern = "span", replacement = "", words, perl = TRUE)
# if you want the sentence
sentence <- paste(clean_words, collapse = " ")
# to remove punctuation: this regex keeps only the letters a-z, A-Z and spaces
clean_sentence <- gsub(pattern = "[^a-zA-Z ]", replacement = "", sentence, perl = TRUE)
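One thing to watch with this approach: "..span?" only becomes empty after the sentence has been pasted together, so two spaces end up next to each other. Below is a minimal base R sketch of one way to tidy that up; tidy_sentence is an illustrative name and not part of the answer.

```r
words <- c("somethingspan.", "..span?", "spanthank", "great to hear", "yourspan")

clean_words <- gsub("span", "", words, fixed = TRUE)   # drop the literal "span"
sentence <- paste(clean_words, collapse = " ")
clean_sentence <- gsub("[^a-zA-Z ]", "", sentence)     # keep letters and spaces only

# "..?" was reduced to nothing, leaving a double space after "something";
# collapse runs of spaces and trim the ends.
tidy_sentence <- trimws(gsub(" +", " ", clean_sentence))
print(tidy_sentence)
## => [1] "something thank great to hear your"
```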
Answer 2 (score: 0)
Use sub to remove span. To turn the result into a sentence, use paste with collapse.
library(magrittr)
sub("^[[:punct:]]{,2}span|span[[:punct:]]{,2}$", "", words) %>% paste(collapse=" ")
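So it only removes span at the start or at the end of each element. Each alternative covers punctuation on one side only, so for ..span? the match is ..span and the ? is left behind: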
[1] "something ? thank great to hear your"
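If the anchoring is what you want but the stray ? is not, one possible variation is to allow punctuation on both sides of span in each alternative and then drop the elements that end up empty. This is a sketch that combines ideas from the answers above, not the answerer's own code; the names pattern, cleaned and anchored_sentence are illustrative.

```r
words <- c("somethingspan.", "..span?", "spanthank", "great to hear", "yourspan")

# span, with any punctuation around it, but only at the start or at the end
pattern <- "^[[:punct:]]*span[[:punct:]]*|[[:punct:]]*span[[:punct:]]*$"

cleaned <- sub(pattern, "", words)
cleaned <- cleaned[cleaned != ""]              # "..span?" is now empty, drop it
anchored_sentence <- paste(cleaned, collapse = " ")
print(anchored_sentence)
## => [1] "something thank great to hear your"

# An interior occurrence is left alone: span is neither at the start nor the end
interior <- sub(pattern, "", "lifespans")
print(interior)
## => [1] "lifespans"
```

The unanchored gsub pattern from the first answer would turn "lifespans" into "lifes", so the choice between the two depends on whether span can occur inside words you want to keep.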