I'm working in elisp, and I have a string that represents a list of items. The string looks like
"apple orange 'tasty things' 'my lunch' zucchini 'my dinner'"
and I'd like to split it into
("apple" "orange" "tasty things" "my lunch" "zucchini" "my dinner")
This is a familiar problem. My obstacle to solving it has less to do with the regexp and more to do with the specifics of elisp.
What I want to do is run a loop like:
(while (> (length my-string) 0) do-work)
where do-work consists of:
applying the regexp \('[^']*?'\|[[:alnum:]]+\)\([[:space:]]*\(.+\)\) to my-string,
appending \1 to my result list,
re-binding my-string to \2.
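However, I can't figure out how to get split-string or replace-regexp-in-string to do this.
How can I split this string into values I can work with?
(Alternatively: "which built-in Emacs function that does this have I not found yet?")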
Answer 0 (score: 5)
Something like this, but without regexps:
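(require 'cl-lib) ; for cl-incf and cl-coerce

(defun parse-quotes (string)
  "Split STRING on spaces, keeping single-quoted runs together."
  (let ((i 0) result current quotep escapedp word)
    (while (< i (length string))
      (setq current (aref string i))
      (cond
       ;; Space outside quotes: finish the current word, if any.
       ((and (char-equal current ?\s)
             (not quotep))
        (when word (push word result))
        (setq word nil escapedp nil))
       ;; Unescaped quote outside a quoted run: open one.
       ((and (char-equal current ?\')
             (not escapedp)
             (not quotep))
        (setq quotep t escapedp nil))
       ;; Unescaped quote inside a quoted run: close it.
       ((and (char-equal current ?\')
             (not escapedp))
        (push word result)
        (setq quotep nil word nil escapedp nil))
       ;; Backslash escapes the next character; a doubled one is literal.
       ((char-equal current ?\\)
        (when escapedp (push current word))
        (setq escapedp (not escapedp)))
       ;; Any other character is added to the current word.
       (t (setq escapedp nil)
          (push current word)))
      (cl-incf i))
    (when quotep
      (error "Unbalanced quotes at %d"
             (- (length string) (length word))))
    ;; Keep a trailing unquoted word.
    (when word (push word result))
    ;; Words were built as reversed character lists; restore the order.
    (mapcar (lambda (x) (cl-coerce (reverse x) 'string))
            (reverse result))))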
(parse-quotes "apple orange 'tasty things' 'my lunch' zucchini 'my dinner'")
("apple" "orange" "tasty things" "my lunch" "zucchini" "my dinner")
(parse-quotes "apple orange 'tasty thing\\'s' 'my lunch' zucchini 'my dinner'")
("apple" "orange" "tasty thing's" "my lunch" "zucchini" "my dinner")
(parse-quotes "apple orange 'tasty things' 'my lunch zucchini 'my dinner'")
;; Debugger entered--Lisp error: (error "Unbalanced quotes at 52")
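Bonus: it also allows quotes to be escaped with "\", and it reports unbalanced quotes (the end of the string was reached without finding a match for an opened quote).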
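Answer 1 (score: 3)
Here is a straightforward way to implement your algorithm using a temporary buffer. I don't know whether there is a way to do this with replace-regexp-in-string or split-string.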
(defun my-split (string)
  (with-temp-buffer
    (insert string " ")     ;; insert the string in a temporary buffer
    (goto-char (point-min)) ;; go back to the beginning of the buffer
    (let ((result nil))
      ;; search for the regexp (and just return nil if nothing is found)
      (while (re-search-forward "\\('[^']*?'\\|[[:alnum:]]+\\)\\([[:space:]]*\\(.+\\)\\)" nil t)
        ;; (match-string 1) is "\1"
        ;; append it after the current list
        (setq result (append result (list (match-string 1))))
        ;; go back to the beginning of the second part
        (goto-char (match-beginning 2)))
      result)))
Example:
(my-split "apple orange 'tasty things' 'my lunch' zucchini 'my dinner'")
==> ("apple" "orange" "'tasty things'" "'my lunch'" "zucchini" "'my dinner'")
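Note that the quoted items above still carry their quote characters. As a minimal sketch of the same loop written with string-match directly on the string (no temporary buffer), the following strips the quotes as it goes. The function name my-split-unquoted and the regexp are illustrative assumptions, not part of the answer above, and escaped quotes are not handled.

```elisp
(defun my-split-unquoted (string)
  "Split STRING into items; text inside '...' is one item, quotes dropped.
Hypothetical helper for illustration only."
  (let ((start 0)
        result)
    ;; Group 1 captures the inside of a quoted item,
    ;; group 2 captures a bare word.
    (while (string-match "'\\([^']*\\)'\\|\\([^'[:space:]]+\\)" string start)
      (push (or (match-string 1 string) (match-string 2 string)) result)
      ;; Resume the search right after the text just matched.
      (setq start (match-end 0)))
    (nreverse result)))

(my-split-unquoted "apple orange 'tasty things' 'my lunch' zucchini 'my dinner'")
;; => ("apple" "orange" "tasty things" "my lunch" "zucchini" "my dinner")
```

Advancing with match-end, rather than re-binding the string to its remainder, avoids copying the string on every iteration.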
Answer 2 (score: 3)
You may want to take a look at split-string-and-unquote.
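One caveat: split-string-and-unquote understands Emacs Lisp double-quoted strings, not single quotes. A rough sketch of one way to apply it here is to convert the quote characters first. The wrapper name my-split-single-quoted is made up for this example, and it assumes the input contains no double quotes or backslashes of its own.

```elisp
(defun my-split-single-quoted (string)
  "Split STRING on whitespace, treating '...' as a single item.
Hypothetical wrapper: assumes STRING has no double quotes or backslashes."
  (split-string-and-unquote
   ;; Turn every ' into \" so the built-in reader-based unquoting applies.
   (replace-regexp-in-string "'" "\"" string t t)))

(my-split-single-quoted "apple orange 'tasty things' 'my lunch' zucchini 'my dinner'")
;; => ("apple" "orange" "tasty things" "my lunch" "zucchini" "my dinner")
```

An apostrophe inside a word (as in "thing's") would also be converted, so this only suits input where single quotes are used purely as delimiters.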
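Answer 3 (score: 0)
If you manipulate strings often, you should install the s.el library through the package manager; it provides a large number of string utility functions under a consistent API. For this task you need the function s-match, whose optional third argument accepts a starting position. Then you need a suitable regexp; try: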
(concat "\\b[a-z]+\\b" "\\|" "'[a-z ]+'")
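Here \| means "match either": a sequence of letters that forms a word (\b denotes a word boundary), or a sequence of letters and spaces inside quotes. Then loop over the matches: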
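;; let s = given string, r = regex
;; needs (require 'cl-lib) and (require 's)
(cl-loop with start = 0
         for match = (car (s-match r s start))
         while match
         collect match
         ;; the match may begin after START (e.g. after a space),
         ;; so advance to where it actually ends
         do (setq start (+ (string-match r s start) (length match))))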
For educational purposes, I also implemented the same functionality with a recursive function:
;; cl-labels is cl-lib's version of Common Lisp's local function definition macro
(cl-labels
    ((i (start result)
       ;; s-match searches from start
       (let ((match (car (s-match r s start))))
         (if match
             ;; recursive call, resuming at the real end of this match
             ;; (the match may begin after START)
             (i (+ (string-match r s start) (length match))
                (cons match result))
           ;; push/nreverse idiom
           (nreverse result)))))
  ;; start the recursive helper function
  (i 0 '()))
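Since Emacs Lisp does not perform general tail call optimization, running this on a large list may cause a stack overflow. You can therefore rewrite it as an iterative do*-style loop: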
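(cl-do* ((start 0)
         (match (car (s-match r s start)) (car (s-match r s start)))
         (result '()))
    ((not match) (reverse result))
  (push match result)
  ;; advance to the real end of this match, which may begin after START
  (setq start (+ (string-match r s start) (length match))))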