I am trying to split the following string:
Change 709131 on 2014/06/05 by person1
- some description
Change 709081 on 2014/06/05 by person2
more description
Change 708930 on 2014/06/04 by person3
description xyz
Change 708906 on 2014/06/04 by person4
description of change
I want to split it at each Change \d+ (that is, at Change 709131, Change 709081, and so on).
I am trying to use
set abc [regexp -inline -all {Change \d+\son.*Change \d+\son} $oIfs]
but I am not getting the desired output.
Edit: one way I found is
set abc [regexp -inline -all {Change.*?(?=Change)} $oIfs]
but it does not return the last part of the string.
Answer 0 (score: 1)
Answer 1 (score: 1)
Tcllib to the rescue: http://tcllib.sourceforge.net/doc/textutil_split.html
package require textutil::split
set s {Change 709131 on 2014/06/05 by person1
- some description
Change 709081 on 2014/06/05 by person2
more description
Change 708930 on 2014/06/04 by person3
description xyz
Change 708906 on 2014/06/04 by person4
description of change}
foreach {chg desc} [lrange [textutil::split::splitx $s {(Change \d+)}] 1 end] {lappend changes "$chg$desc"}
set i 0
foreach chg $changes {puts "[incr i]> $chg"}
1> Change 709131 on 2014/06/05 by person1
- some description
2> Change 709081 on 2014/06/05 by person2
more description
3> Change 708930 on 2014/06/04 by person3
description xyz
4> Change 708906 on 2014/06/04 by person4
description of change
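If Tcllib is not available, a similar split can be sketched with core commands only. This is a rough alternative, not the textutil::split approach above: it assumes the ASCII record separator character (\x1e) never occurs in the data and that a line starting with "Change " always begins a new record. The sample string is a shortened, hypothetical copy of the question's data.

```tcl
# Hypothetical sample with the same shape as the data in the question
set s "Change 709131 on 2014/06/05 by person1\n- some description\nChange 709081 on 2014/06/05 by person2\nmore description"

# Assumption: the ASCII record separator (\x1e) never occurs in the data
set sep "\x1e"

# Replace the newline before each "Change " with the separator, then split on it
set changes [split [string map [list "\nChange " "${sep}Change "] $s] $sep]

set i 0
foreach chg $changes {puts "[incr i]> $chg"}
```

Unlike splitx, this consumes the newline in front of each record header, so the records carry no trailing newline.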
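Answer 2 (score: 1)
One way to solve this is to process the data line by line and build up "records". When you encounter the start of a record, do something with the previous record, then reset (that is, start building a new record). Here is some suggested code: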
set data {Change 709131 on 2014/06/05 by person1
- some description
Change 708906 on 2014/06/04 by person4
description of change
}
proc do_something {record} {
    # Process a record, in this case, just print it out with separators
    if {[llength $record] == 0} { return }
    puts "----------------"
    foreach line $record {
        puts $line
    }
}
set record [list]
foreach line [split $data \n] {
    if {[regexp {^Change \d+} $line]} {
        # Encounter the start of a record, process the previous record
        # and start a new record
        do_something $record
        set record [list]
    }
    lappend record $line
}
# Process the last record
if {[llength $record] != 0} { do_something $record }
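If you would rather collect the records than print them as you go, the same loop can be wrapped in a proc that returns a list of records. This is only a sketch of that variation: split_records is a hypothetical helper name, and the sample data is a shortened copy of the data above without the trailing newline.

```tcl
# Hypothetical helper: returns a list of records, each record being a list of lines
proc split_records {data} {
    set records [list]
    set record [list]
    foreach line [split $data \n] {
        # A header line closes the record collected so far (if any)
        if {[regexp {^Change \d+} $line] && [llength $record] > 0} {
            lappend records $record
            set record [list]
        }
        lappend record $line
    }
    # Do not forget the last record
    if {[llength $record] > 0} {
        lappend records $record
    }
    return $records
}

set data "Change 709131 on 2014/06/05 by person1\n- some description\nChange 708906 on 2014/06/04 by person4\ndescription of change"
set records [split_records $data]
foreach record $records {
    puts "----------------"
    puts [join $record \n]
}
```

Note that any lines appearing before the first "Change" header would end up in a record of their own.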
Answer 3 (score: 1)
This is a tricky regular expression, but it works on your sample data:
regexp -all -inline {(?w)^Change.*?(?:\Z|\n(?=Change))} $sampleData
Looking at the individual parts of the RE itself:
(?w) # "Weird" mode; ^ and $ are line anchored but . matches newlines
^Change # "Change" at the start of a line...
.*? # and as few extra characters as possible, until...
(?: # (start non-capturing group)
\Z # ... the end of the whole string...
| # or...
\n # ... newline, followed by...
(?=Change) # ... "Change" (as zero-width lookahead)
) # (end non-capturing group)
With your sample data:
% regexp -all -inline {(?w)^Change.*?(?:\Z|\n(?=Change))} $sampleData
{Change 709131 on 2014/06/05 by person1
- some description
} {Change 709081 on 2014/06/05 by person2
more description
} {Change 708930 on 2014/06/04 by person3
description xyz
} {Change 708906 on 2014/06/04 by person4
description of change}
Looks OK to me. This assumes that nobody ever puts "Change" directly at the start of a line in a description.
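To illustrate that caveat, here is a small sketch with hypothetical data in which a description line starts with "Change". The stricter pattern is only an assumption about the header format (that "Change " is always followed by a digit); it adds no new quantifier, so the overall non-greedy preference of the RE stays as it was.

```tcl
# Hypothetical data: the first description has a line that starts with "Change"
set sampleData "Change 709131 on 2014/06/05 by person1\nChange the default timeout\nChange 709081 on 2014/06/05 by person2\nmore description"

# Original pattern: splits in front of every line that starts with "Change"
set loose [regexp -all -inline {(?w)^Change.*?(?:\Z|\n(?=Change))} $sampleData]

# Stricter variant: only split in front of "Change " followed by a digit
set strict [regexp -all -inline {(?w)^Change.*?(?:\Z|\n(?=Change \d))} $sampleData]

puts "loose pattern:  [llength $loose] pieces"
puts "strict pattern: [llength $strict] pieces"
```

With the loose pattern the "Change the default timeout" line is treated as the start of its own record; the stricter lookahead keeps it with the record it belongs to.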