TCL: regexp to exclude strings containing certain characters

Time: 2012-11-22 11:34:57

Tags: tcl

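I am not very experienced with TCL or with TCL regexp. I need a TCL mechanism or regular expression that, given a line or sentence, can exclude the words containing certain special characters, or at least tell me that a word contains them.

Suppose my line/sentence looks like this: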

 (space)(space)At 4:00:00AM (not sure) please do your work ...

I tried splitting the line so that I can loop over each word with foreach:

% set fields [split "   At 4:00:00AM (not sure) please do your work" " " ]
{} {} {} At 4:00:00AM (not sure) please do your work

But I do not want the empty fields that the leading spaces produce:

% foreach val $fields {
       puts $val
}



At
4:00:00AM
(not
sure)
please
do
your
work

In addition, inside the foreach loop I want to exclude the words that contain special characters, such as:

(not
sure)
4:00:00AM

That is, exclude any word that has '(' or ':' at its beginning, at its end, or anywhere inside it.

Please let me know how to achieve this.

2 answers:

Answer 0: (score: 1)

set str "   At 4:00:00AM (not sure) please do your work"

# split the string into space-delimited words
set words [regexp -inline -all {\S+} $str]

# eliminate words containing a character other than letters, numbers, underscore
set alnum_words [lsearch -inline -regexp -all -not $words {\W}]

alnum_words now contains the list {At please do your work}

If you want only the words that consist entirely of letters, use

lsearch -inline -regexp -all $words {^[[:alpha:]]+$}
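
Note that {\W} rejects every word containing any character other than a letter, digit, or underscore, which is broader than what the question asks for; a word such as "work..." would be dropped as well. If only '(', ')' and ':' should disqualify a word, a narrower character class can be used with the same lsearch call. The following is a minimal sketch under that assumption; the variable name kept is only illustrative.

```tcl
set str "   At 4:00:00AM (not sure) please do your work"

# Extract runs of non-whitespace, so no empty fields are produced
set words [regexp -inline -all {\S+} $str]

# -not inverts the match: keep the words that do NOT contain ( ) or :
set kept [lsearch -inline -regexp -all -not $words {[():]}]

puts $kept
```

For the sample line this prints: At please do your work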

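Answer 1: (score: 0)

Unfortunately, Tcl regexp does not support lookbehind; otherwise this could be done with a single regular expression. However, you can build the desired list of words with the following code, which matches the preceding whitespace explicitly and uses a lookahead for the whitespace that follows: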

set the_line "   At 4:00:00AM (not sure) please do your work"
set fields {}
foreach {- val} [regexp -all -inline -- {(?:^|\s)([^:()\s]+(?=\s|$))} $the_line] {
    lappend fields $val
}
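
Since the question asks for a way to skip such words inside a foreach loop, the same filtering can also be written as an explicit loop with no lookaround at all. This is only a sketch, assuming that '(', ')' and ':' are the characters to reject; the names words and kept are illustrative.

```tcl
set the_line "   At 4:00:00AM (not sure) please do your work"

# Collect whitespace-delimited words; unlike split, this yields no empty fields
set words [regexp -all -inline {\S+} $the_line]

set kept {}
foreach word $words {
    # Skip any word that contains ( ) or : at any position
    if {[regexp {[():]} $word]} {
        continue
    }
    lappend kept $word
}

puts $kept
```

For the sample line this prints: At please do your work. Any per-word processing can go in the loop body after the continue check.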