Extracting multiple values using a regular expression

Date: 2017-08-30 19:15:30

Tags: regex tcl distinct-values

Can you help me with this regular expression? My output looks like this:

Wed Aug 30 14:47:11.435 EDT 

  Interface : p16, Value Count : 9 
  References : 1, Internal : 0x1 
  Values : 148, 365, 366, 367, 371 
        120577, 120578, 120631, 120632 

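I need to extract all the numbers in the Values list from this output. There may be more or fewer values than there are now. So far I have this (but it only extracts the last value):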

\s+Values\s+:\s+((\d+)(?:,?)(?:\s+))+

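Thanks

Edit: added the full output.

5 answers:

Answer 0 (score: 3)

As @dawg mentioned, you need a two-step approach in Tcl, because its regex engine does not store multiple captures for the same group, and it does not support the \G operator.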
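To make the limitation concrete, here is a minimal sketch (the sample string and variable names are made up for illustration). The second group sits inside a repetition, so it is matched several times, yet regexp hands back a single value for it; collecting every number takes a separate -all -inline pass.

```tcl
# Illustrative sample, not the full output from the question.
set sample "148, 365, 366"

# The group inside (?:...)* is matched once per extra number,
# but regexp stores only one value for it in `last`.
regexp {(\d+)(?:,\s*(\d+))*} $sample whole first last
puts "whole match:    $whole"
puts "repeated group: $last"

# A second pass with -all -inline returns every run of digits as a list.
set all [regexp -all -inline {\d+} $sample]
puts "all numbers:    $all"
```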

Here is the final solution:

set text {Wed Aug 30 14:47:11.435
EDT Interface : p16,
Value Count : 9 References : 1, Internal : 0x1
Values : 148, 365, 366, 367, 371
         120577, 120578, 120631, 120632}

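set pattern {\sValues\s*:\s*\d+(?:[\s,]*\d+)*}
# Step 1: grab the "Values : ..." block; regexp returns 1 when it matches.
if {[regexp $pattern $text match]} {
    # Step 2: pull every run of digits out of that block.
    set results [regexp -all -inline {\d+} $match]
    puts $results
} else {
    puts "No match"
}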

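See the Tcl demo, which prints 148 365 366 367 371 120577 120578 120631 120632

Details

The first match operation extracts the substring that starts with Values and is followed by numbers separated by commas or whitespace:

  • \s - a whitespace character
  • Values - the literal text Values
  • \s*:\s* - a colon surrounded by 0+ whitespace characters
  • \d+ - one or more digits
  • (?:[\s,]*\d+)* - zero or more sequences of 0+ whitespace characters or commas followed by 1+ digits.

The second step uses regexp -all -inline {\d+} $match to extract all chunks of 1+ digits.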
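The two steps can also be wrapped in a small helper. This is only a sketch: the proc name extractValues is hypothetical, and the sample text is the output from the question laid out as originally posted.

```tcl
# Hypothetical helper: returns the numbers that follow "Values :",
# or an empty list when no such block is found.
proc extractValues {text} {
    set pattern {\sValues\s*:\s*\d+(?:[\s,]*\d+)*}
    # Step 1: isolate the "Values : ..." block.
    if {![regexp $pattern $text match]} {
        return {}
    }
    # Step 2: collect every run of digits inside that block.
    return [regexp -all -inline {\d+} $match]
}

set output {Wed Aug 30 14:47:11.435 EDT

  Interface : p16, Value Count : 9
  References : 1, Internal : 0x1
  Values : 148, 365, 366, 367, 371
        120577, 120578, 120631, 120632}

puts [extractValues $output]
puts [llength [extractValues "no such field"]]
```

Returning an empty list instead of printing "No match" lets the caller decide how to handle missing data.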

Answer 1 (score: 2)

Assuming the string is in a variable, the numbers can be extracted in two regexp steps.

That is: pick all the text between the last colon and the end of the string (strictly: the longest sequence of characters, from a set that excludes the colon, that is anchored at the end of the string). From this text, match all groups of digits. This is a similar solution to Wiktor's, but uses a somewhat less intricate pattern for the match in the first step. There is no problem if there is no match, since that only means you get an empty list of numbers in the second step.
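A minimal sketch of that description, assuming the output is held in a variable named text (the variable names and the exact pattern {[^:]+$} are my reading of the description, not necessarily the answer's original code):

```tcl
set text {Wed Aug 30 14:47:11.435 EDT

  Interface : p16, Value Count : 9
  References : 1, Internal : 0x1
  Values : 148, 365, 366, 367, 371
        120577, 120578, 120631, 120632}

# Step 1: the longest run of non-colon characters anchored at the end,
# i.e. everything after the last colon. lindex yields "" if nothing matched.
set tail [lindex [regexp -inline {[^:]+$} $text] 0]

# Step 2: every group of digits in that tail.
set numbers [regexp -all -inline {\d+} $tail]
puts $numbers

# When the string ends in a colon, step 1 finds nothing and
# step 2 simply returns an empty list.
set none [regexp -all -inline {\d+} [lindex [regexp -inline {[^:]+$} "Values :"] 0]]
puts [llength $none]
```

Note that this relies on the Values list being the last thing in the string; any later field containing a colon would change what step 1 picks up.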

Documentation: regexp, Syntax of Tcl regular expressions

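Answer 2 (score: 0)

[0-9]

This is a regular expression that matches only the digits in the string. It matches every single digit there.

Answer 3 (score: 0)

Why not just match \d+ (one or more digits per match)?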
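For comparison, a quick sketch of what these two patterns return when applied to the whole output from the question rather than to the Values block alone (variable names are illustrative). Both also pick up the digits in the timestamp, p16, Value Count, and the other fields, and [0-9] returns one digit per list element.

```tcl
set text {Wed Aug 30 14:47:11.435 EDT

  Interface : p16, Value Count : 9
  References : 1, Internal : 0x1
  Values : 148, 365, 366, 367, 371
        120577, 120578, 120631, 120632}

# \d+ over the whole text: every run of digits, wanted or not.
set runs [regexp -all -inline {\d+} $text]
puts $runs

# [0-9] over the whole text: every individual digit.
set digits [regexp -all -inline {[0-9]} $text]
puts [lrange $digits 0 4]
```

This is why the higher-scored answers first narrow the text down to the part after "Values :".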

Answer 4 (score: 0)

Assuming that you are searching for all numbers after the string "Values :", and that there is nothing else after those numbers, you can do it with the usual string commands, nested so that the result is a list containing the numbers.

Reading it from the inside out, you search for the index of the "Values :" string. You then grab the string from that index plus 8, until the end of the string. Then you use string map to replace any newlines with a comma. Finally you use split to convert the string to a list, using the comma as a delimiter.
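The steps described above can be sketched as follows, assuming the output is in a variable named text and that "Values :" is present. The final trimming loop is my addition, not part of the description: splitting on commas alone leaves the surrounding whitespace attached to each element.

```tcl
set text {Wed Aug 30 14:47:11.435 EDT

  Interface : p16, Value Count : 9
  References : 1, Internal : 0x1
  Values : 148, 365, 366, 367, 371
        120577, 120578, 120631, 120632}

# Index just past "Values :" (the marker is 8 characters long).
set start [expr {[string first "Values :" $text] + 8}]

# Take the rest of the string, turn newlines into commas, split on commas.
set raw [split [string map [list \n ,] [string range $text $start end]] ,]

# Added step: strip whitespace and skip empty elements.
set numbers {}
foreach item $raw {
    set item [string trim $item]
    if {$item ne ""} {
        lappend numbers $item
    }
}
puts $numbers
```

If "Values :" might be missing, string first returns -1 and the range would start in the wrong place, so a real script should check for that first.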