Question

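I have a list of strings named F:

("hello word i'am walid" "goodbye madame") => this list contains two string elements

I have another list called S, like this: ("word" "madame") => this one contains two words

Now I want to remove the elements of list S from each string in list F, so the output should look like this: ("hello i'am walid" "goodbye")

I found this function: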

(defun remove-string (rem-string full-string &key from-end (test #'eql)
                      test-not (start1 0) end1 (start2 0) end2 key)
  "returns full-string with rem-string removed"
  (let ((subst-point (search rem-string full-string
                             :from-end from-end
                             :test test :test-not test-not
                             :start1 start1 :end1 end1
                             :start2 start2 :end2 end2 :key key)))
    (if subst-point
        (concatenate 'string
                     (subseq full-string 0 subst-point)
                     (subseq full-string (+ subst-point (length rem-string))))
        full-string)))

For example: (remove-string "walid" "hello i'am walid") => the output is "hello i'am"

But there is a problem.

Example:

(remove-string "wa" "hello i'am walid") => the output "hello i'am lid"

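But the output should be "hello i'am walid"; in other words, I want to remove only exact (whole) words from the string.

One solution I have is to use

(cl-ppcre:regex-replace-all "\\s*\\bwa\\b\\s*" "ba wa walid" " ")

It works well, but there is a problem: (cl-ppcre:regex-replace-all "\\s*\\bam\\b\\s*" "i'am wa walid" " ") => "i' wa walid", whereas I should get "i'am wa walid", because "i'am" is a whole word.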

Answer 1

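You can define the boundary characters explicitly instead of using \b. Below I use whitespace, a comma, the start or end of the string, or a period as boundary characters.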

(cl-ppcre:regex-replace-all 
   #?r"(\s|^|$|,|\.)(am)(\s|^|$|,|\.)" 
   "i'am wa walid" 
   #?r"\1 \3")

(N.B.: the #?r"" syntax comes from cl-interpol:enable-interpol-syntax and makes the regular expression easier to read.)
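For readers who do not use cl-interpol, the same call can be sketched with an ordinary string literal; the only change is that every backslash has to be doubled. This assumes CL-PPCRE is loaded through Quicklisp, which the answer itself does not state.

```lisp
;; Assumption: Quicklisp is installed; the original answer does not say how
;; CL-PPCRE was loaded.
(ql:quickload :cl-ppcre)

;; Same pattern as above, written as a plain string: every backslash is doubled.
(cl-ppcre:regex-replace-all
 "(\\s|^|$|,|\\.)(am)(\\s|^|$|,|\\.)"
 "i'am wa walid"
 "\\1 \\3")
;; => "i'am wa walid"  (unchanged: the "am" in "i'am" follows an apostrophe,
;;                      which is not one of the listed boundary characters)
```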

If we are using interpol, the word can also be spliced into the pattern through interpolation:

(let ((word "am"))
  (cl-ppcre:regex-replace-all 
     #?r"(\s|^|$|,|\.)(${word})(\s|^|$|,|\.)" 
     "i'am wa walid" 
     #?r"\1 \3"))
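To tie this back to the two lists in the question, here is a minimal sketch of a different, regex-free technique: split each string on spaces, drop every token that is exactly one of the words, and join the rest. It uses only standard Common Lisp. The helper names split-on-spaces, remove-words and remove-words-from-all are hypothetical and do not come from the original post. Unlike the regex above, it treats only spaces as boundaries, so "madame," with a trailing comma would not be removed.

```lisp
;; Hypothetical helpers for illustration; standard Common Lisp only.

(defun split-on-spaces (string)
  "Split STRING into a list of non-empty tokens separated by spaces."
  (loop with start = 0
        for end = (position #\Space string :start start)
        for token = (subseq string start end)
        unless (string= token "") collect token
        while end
        do (setf start (1+ end))))

(defun remove-words (words string)
  "Return STRING without any space-separated token that is STRING= to a member of WORDS."
  (let ((kept (remove-if (lambda (token) (member token words :test #'string=))
                         (split-on-spaces string))))
    ;; Join the surviving tokens with single spaces.
    (format nil "~{~A~^ ~}" kept)))

(defun remove-words-from-all (words strings)
  "Apply REMOVE-WORDS to every element of STRINGS."
  (mapcar (lambda (string) (remove-words words string)) strings))

(remove-words-from-all '("word" "madame")
                       '("hello word i'am walid" "goodbye madame"))
;; => ("hello i'am walid" "goodbye")
```

Because tokens are compared as whole strings, (remove-words '("am") "i'am wa walid") leaves the input unchanged, which is the behaviour the question asks for.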

Cheers, I hope I answered the right question.
