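I'm not very proficient with Tcl or with Tcl regexp, but I need a Tcl mechanism or regular expression that, given a line/sentence, can exclude or flag the words that contain certain special characters.
Suppose my line/sentence looks like this: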
(space)(space)At 4:00:00AM (not sure) please do your work ...
I tried splitting the line so that I can loop over each word with foreach:
% set fields [split "   At 4:00:00AM (not sure) please do your work" " " ]
{} {} {} At 4:00:00AM (not sure) please do your work
But again, I don't want the empty fields:
% foreach val $fields {
    puts $val
}
At
4:00:00AM
(not
sure)
please
do
your
work
Beyond that, I want the foreach loop to skip words that contain special characters, such as:
(not
sure)
4:00:00AM
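That is, exclude any word that has '(' or ':' at its beginning, at its end, or anywhere in between.
Please let me know how I can achieve this.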
Answer 0 (score: 1)
set str " At 4:00:00AM (not sure) please do your work"
# split the string into space-delimited words
set words [regexp -inline -all {\S+} $str]
# eliminate words containing a character other than letters, numbers, underscore
set alnum_words [lsearch -inline -regexp -all -not $words {\W}]
alnum_words now contains the list {At please do your work}
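The difference between the two splitting approaches can be seen side by side. This is a minimal sketch, assuming three leading spaces in the input; the variable names by_split and by_regexp are only illustrative.

```tcl
set str "   At 4:00:00AM (not sure) please do your work"

# split on a single space: each leading space produces an empty element
set by_split [split $str " "]

# collect runs of non-whitespace: empty elements never appear
set by_regexp [regexp -inline -all {\S+} $str]

puts [llength $by_split]    ;# 3 empty elements + 8 words
puts [llength $by_regexp]   ;# 8 words only
```

Because {\S+} matches only non-whitespace runs, it also copes with tabs and repeated spaces between words.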
If you only want the words that consist solely of letters, use
lsearch -inline -regexp -all $words {^[[:alpha:]]+$}
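Both patterns above are stricter than what the question asks for: they also drop words such as "don't" or "work." that contain other punctuation. If only the characters named in the question should disqualify a word, the same lsearch call can probably be given a narrower bracket expression. A sketch, with an input string altered for illustration and ')' added to the set on the assumption that "sure)" should be excluded too:

```tcl
set str " At 4:00:00AM (not sure) don't stop your work."
set words [regexp -inline -all {\S+} $str]

# keep every word that does NOT contain '(', ')' or ':' anywhere
set kept [lsearch -inline -regexp -all -not $words {[():]}]
puts $kept
```

Here "don't" and "work." survive, whereas the {\W} pattern would remove them.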
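Answer 1 (score: 0)
Unfortunately, Tcl regexp does not support lookbehind; otherwise this could be done with a single regular expression. You can still build the desired word list with the following code: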
set the_line " At 4:00:00AM (not sure) please do your work"
set fields {}
foreach {- val} [regexp -all -inline -- {(?:^|\s)([^:()\s]+(?=\s|$))} $the_line] {
    lappend fields $val
}
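If the filtering should happen inside the foreach loop itself, as the question describes, a simpler variant is to split into words first and skip the unwanted ones with continue. This is only a sketch: clean_words is a hypothetical helper name, and treating ')' as a disqualifying character is an assumption based on the "sure)" example.

```tcl
# clean_words is a hypothetical helper, not part of Tcl
proc clean_words {line} {
    set fields {}
    foreach val [regexp -all -inline {\S+} $line] {
        # skip any word containing '(', ')' or ':'
        if {[regexp {[():]} $val]} {
            continue
        }
        lappend fields $val
    }
    return $fields
}

puts [clean_words " At 4:00:00AM (not sure) please do your work"]
```

The loop body is the place to add any per-word processing, which the single-regexp version does not offer as directly.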