Question

I am running a command-line tool that returns output like this:

data {   
  metric: 0   
  metric: 1234.5
  metric: 230499
  metric: 234234
} 
data {   
  metric: 0   
  metric: 6789  
  metric: 23526   
  metric: 234634767 
}

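I basically want to compute (1234.5 / 6789), that is, the ratio between the second metric lines of the two results. The numbers can be decimals. The output will always be in this order. Is this possible with grep / sed?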

Answer 1

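Here is an offbeat answer: Tcl. The syntax of that output resembles Tcl syntax, so we can define a procedure named data and a procedure named metric:, and then execute the output as if it were a Tcl script. You would run it like this: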

tclsh pct.tcl <(the process that produces the output)

The "pct.tcl" script is:

#!/usr/bin/env tclsh

set n 0
set values [dict create]

proc data {block} {
    uplevel 1 $block
    incr ::n
}

proc metric: {value} {
    dict lappend ::values $::n $value
}

source [lindex $argv 0]

foreach num [dict get $values 0] denom [dict get $values 1] {
    if {$denom == 0} {
        puts "$num / $denom = Inf"
    } else {
        puts [format "%s / %s = %.2f" $num $denom [expr {double($num) / $denom}]]
    }
}

Output:

0 / 0 = Inf
1234.5 / 6789 = 0.18
230499 / 23526 = 9.80
234234 / 234634767 = 0.00

Answer 2

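One of your requirements seems to be to use only bash commands (grep, sed, and so on). But be aware that you will need something else to do the decimal division. The simplest option is bc.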
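To illustrate why an extra tool is needed, here is a minimal sketch. It assumes a POSIX shell and uses awk only as a commonly preinstalled stand-in for a floating-point calculator; bc -l would serve the same purpose. The variable names are made up for the example.

```shell
#!/bin/sh
# Shell arithmetic is integer-only, so the fractional part is lost.
# (The inputs are integers here because $(( )) rejects 1234.5 outright.)
int_ratio=$(( 1234 / 6789 ))
echo "shell arithmetic: $int_ratio"

# awk does floating-point arithmetic; it stands in for bc in this sketch.
float_ratio=$(awk 'BEGIN { printf "%.4f\n", 1234.5 / 6789 }')
echo "awk: $float_ratio"
```

The first line prints 0 and the second prints 0.1818, which is why the pipeline below hands the division to an external tool.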

Here is my suggestion using grep, sed, cut, and bc. I did not try to make it compact. In theory, you should be able to do it with just one big sed command!

./yourProgram | grep metric | sed -n 2~4p | sed -r 's/^\s+//' | cut -f2 -d' ' | sed 'N;s_\n_ / _' | bc -l

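Let's go through it step by step:

grep metric selects the lines containing "metric"
sed -n 2~4p prints every fourth line, starting from the second (the first~step address form is a GNU sed extension)
sed -r 's/^\s+//' strips the whitespace at the start of each line. -r enables extended regular expressions, so that + can be written unescaped; it is not mandatory, but it makes the expression easier to read. On macOS, use -E instead
cut -f2 -d' ' selects the second field of each line (the delimiter is a space)
sed 'N;s_\n_ / _' replaces the newline with " / ". Note that we use "_" instead of "/" as the delimiter of the s command so that the "/" in the replacement does not have to be escaped
bc -l evaluates the resulting expression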
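The stages above can be checked one at a time. The following sketch is an illustration only: sample_output is a hypothetical stand-in for ./yourProgram, expr_line is a made-up variable name, and GNU sed is assumed (for the 2~4 address and \s). It captures the expression that would be handed to bc, and evaluates it only if bc is installed.

```shell
#!/bin/sh
# sample_output is a hypothetical stand-in for ./yourProgram.
sample_output() {
cat <<'EOF'
data {
  metric: 0
  metric: 1234.5
  metric: 230499
  metric: 234234
}
data {
  metric: 0
  metric: 6789
  metric: 23526
  metric: 234634767
}
EOF
}

# Extraction stages only: the result is the expression bc would evaluate.
# Assumes GNU sed (2~4 address, \s).
expr_line=$(sample_output \
  | grep metric \
  | sed -n '2~4p' \
  | sed -r 's/^\s+//' \
  | cut -f2 -d' ' \
  | sed 'N;s_\n_ / _')
echo "$expr_line"

# Evaluate the expression if bc is available on this machine.
if command -v bc >/dev/null 2>&1; then
  echo "$expr_line" | bc -l
fi
```

With the sample input, the extraction stages produce the line 1234.5 / 6789.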

Answer 3

Here is a solution using awk:

#!/usr/bin/awk -f
BEGIN {
    FS = " *\n? *[a-zA-Z]*: *"
    RS = "} *\n"
}
NR <= 2 { a[NR] = $3 }
END { print (a[1] / a[2]) }

You can use that file with the command:

$ awk -f <awk-file> <data-file>

Or you can make it executable and call it directly.

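awk splits the input data into records, and each record into fields. In the BEGIN block I carefully craft the record and field separators so that the metric of interest ends up in the 3rd field of the record. (The first field is data {.)

Then, for the first and second records, I store the third field in the array a.

Finally, I print the ratio between the first and second elements of the array.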

Update: I managed to boil it down to 3 lines:

BEGIN { RS="} *\n" }
NR<=2 { a[NR] = $6 }
END { print (a[1]/a[2]) }

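If the field separator is not set, it keeps its default value. So $1 is data, $2 is {, $3 is the first metric:, $4 is the first number, $5 is the second metric:, and $6 is the number we want.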
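The field numbering described above can be inspected directly. This is a sketch under assumptions: sample_output is a hypothetical stand-in for the real tool, and the awk in use accepts a regular expression as RS (an extension found in gawk and mawk).

```shell
#!/bin/sh
# sample_output is a hypothetical stand-in for the real command.
sample_output() {
cat <<'EOF'
data {
  metric: 0
  metric: 1234.5
  metric: 230499
  metric: 234234
}
data {
  metric: 0
  metric: 6789
  metric: 23526
  metric: 234634767
}
EOF
}

# Print the first six fields of the first record, one per line.
fields=$(sample_output | awk '
  BEGIN { RS = "} *\n" }
  NR == 1 { for (i = 1; i <= 6; i++) printf "$%d = %s\n", i, $i }')
echo "$fields"

# The three-line program itself, run on the same input.
ratio=$(sample_output | awk '
  BEGIN { RS = "} *\n" }
  NR <= 2 { a[NR] = $6 }
  END { print a[1] / a[2] }')
echo "$ratio"
```

The last of the six lines printed is $6 = 1234.5, and the ratio is printed as 0.181838 because awk formats numbers with %.6g by default.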

Answer 4

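grep / sed cannot perform arithmetic evaluation, nor can they set state variables, so no, this is not possible with them alone. Basically, they are not designed for anything beyond search and replace. It can be pulled off by coupling them with head / bc / etc., but that is very inconvenient and fragile.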

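awk can do this (the code is tailored to be production-grade, so it validates the input and follows the DRY principle; it needs GNU awk, because the three-argument form of match() is a gawk extension):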

function error(m){print m " at line " FNR ":`" $0 "'">"/dev/stderr";_error=1;exit 1;}
BEGIN{brace=0; #brace level
index_=0; #record index
v1="+NaN";v2=v1; #values; if either is not reassigned, the result will be NaN
first_section=0; #1st section ended
second_section=0; #2nd section ended
record_pattern="[[:space:]]*metric:[[:space:]]*([[:digit:]]+(\\.[[:digit:]]+)?)[[:space:]]*$";
}
END{if(_error)exit;
if (brace>0){error("invalid:unclosed section");}
if(!second_section){error("invalid:less than 2 sections present")}}
#section start
/^data[[:space:]]+\{[[:space:]]*$/{if(brace>0){error("invalid:nested brace");}brace+=1;next;}
#section end
/^\}[[:space:]]*$/{brace-=1;if(brace<0){error("invalid:unmatched brace")}index_=0;
if(!first_section){first_section=1;next;}
if(!second_section){second_section=1;}
next;}
#record
$0~record_pattern{
match($0,record_pattern,m); #awk cannot capture groups from the line pattern
if(brace==0)error("invalid:record outside a section");
if(index_==1){
  if(!first_section){v1=m[1];}
  else if(!second_section){v2=m[1];}}
 index_++;next;
}
#anything else
{error("invalid:unrecognized syntax");}
#in the very end and if there were no errors
END{print v1/v2;}

That said, equivalent programs in perl and python would be more readable (and therefore more maintainable).
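For comparison, here is a much shorter sketch of the same state-tracking idea that avoids the gawk-only match() call. It is not the program above: it checks only for a missing second section and for division by zero, and the names sample_output, section, idx and v are made up for the example.

```shell
#!/bin/sh
# sample_output is a hypothetical stand-in for the real command.
sample_output() {
cat <<'EOF'
data {
  metric: 0
  metric: 1234.5
  metric: 230499
  metric: 234234
}
data {
  metric: 0
  metric: 6789
  metric: 23526
  metric: 234634767
}
EOF
}

ratio=$(sample_output | awk '
  # A new block starts: bump the section counter, reset the metric index.
  $1 == "data"    { section++; idx = 0; next }
  # Count metrics within the block and keep the second one.
  $1 == "metric:" { idx++; if (idx == 2) v[section] = $2; next }
  END {
    if (section < 2) { print "fewer than 2 sections" > "/dev/stderr"; exit 1 }
    if (v[2] == 0)   { print "division by zero" > "/dev/stderr"; exit 1 }
    printf "%.4f\n", v[1] / v[2]
  }')
echo "$ratio"
```

With the sample input this prints 0.1818.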

Answer 5

Here is a Perl solution.

Given:

$ echo "$tgt"
data {   
  metric: 0   
  metric: 1234.5
  metric: 230499
  metric: 234234
} 
data {   
  metric: 0   
  metric: 6789  
  metric: 23526   
  metric: 234634767 
}

You can find the pair you want using a regular expression in perl's 'slurp' mode:

$ echo "$tgt" | perl -0777 -lne '
@a=/^data\s+\{\s+(?:metric:[\s\d.]+){1}metric:\s+(\d+(?:\.\d+)?)/gm;
print $a[0]/$a[1]
'
0.181838267786125

In this case, in (?:metric:[\s\d.]+){1}, the value 1 inside the braces selects which pair is used; here, 1234.5 and 6789.

Extract numbers from a string and compute a percentage in Bash
