Question

I am trying to use awk to print lines from a semicolon (;) delimited text file in which the third field contains a number within a certain range, e.g.

[root@example ~]# cat foo.csv
john doe; lawyer; section 4 stand 356; area 5
chris thomas; carpenter; stand 289 section 2; area 5
tom sawyer; politician; stan 210 section 4; area 6

I would like awk to give me all lines in which the third field contains a number between 200 and 300, regardless of the other text in that field.

Answer 1

You can use a regular expression, like this:

awk -F\; '$3 ~ /\y2[0-9][0-9]\y/' a.csv
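The `\y` word-boundary operator used above is a GNU awk extension, so the one-liner may not behave the same in other awk implementations. A minimal sketch of a more portable variant follows, assuming it is acceptable to treat any non-digit character as a boundary (so a value such as `abc210` would also match, unlike with `\y`). The temporary file and the variable name `portable_matches` are introduced here for illustration only.

```shell
# Sample data from the question, written to a temporary file.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
john doe; lawyer; section 4 stand 356; area 5
chris thomas; carpenter; stand 289 section 2; area 5
tom sawyer; politician; stan 210 section 4; area 6
EOF

# Pad the third field with spaces so that a number at the very start or
# end of the field still has a non-digit neighbour on both sides, then
# look for a three-digit number starting with 2 (i.e. 200 to 299).
portable_matches=$(awk -F';' '(" " $3 " ") ~ /[^0-9]2[0-9][0-9][^0-9]/' "$tmp")
printf '%s\n' "$portable_matches"
rm -f "$tmp"
```

Like the original regular expression, this covers 200 through 299 only; 300 itself is not matched.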

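A better version, which lets you pass the boundaries on the command line without changing the regular expression, looks like this:

(Since it is a more complex script, I recommend saving it to a file.)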

filter.awk

BEGIN { FS=";" }

{
    # Split the 3rd field by sequences of non-numeric characters
    # and store the pieces in 'a'. 'a' will contain the numbers
    # of the 3rd field (plus optional empty strings if $3 does
    # not start or end with a number)
    split($3, a, "[^0-9]+")

    # iterate through a and check if a number is within the range
    for(i in a){
        if(a[i]!="" && a[i]>=low && a[i]<high){
            print
            next
        }
    }
}

Call it like this:

awk -v high=300 -v low=200 -f filter.awk a.csv
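Note that the script compares with `<high`, so a value equal to the upper bound is excluded. As a rough sketch of the same split-and-compare idea with an inclusive upper bound, the following inlines the program instead of using a file. The fourth data row, the temporary file, and the variable name `range_matches` are hypothetical additions for illustration; the `+ 0` forces a numeric comparison.

```shell
# Sample data from the question plus one hypothetical row that sits
# exactly on the upper bound.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
john doe; lawyer; section 4 stand 356; area 5
chris thomas; carpenter; stand 289 section 2; area 5
tom sawyer; politician; stan 210 section 4; area 6
hypothetical row; tester; stand 300; area 7
EOF

range_matches=$(awk -F';' -v low=200 -v high=300 '
{
    # Split the 3rd field on runs of non-digits; n is the piece count.
    n = split($3, a, "[^0-9]+")
    for (i = 1; i <= n; i++) {
        # Skip empty pieces, then compare numerically, bounds inclusive.
        if (a[i] != "" && a[i] + 0 >= low + 0 && a[i] + 0 <= high + 0) {
            print
            next
        }
    }
}' "$tmp")
printf '%s\n' "$range_matches"
rm -f "$tmp"
```

Iterating from 1 to n rather than with `for (i in a)` checks the pieces in field order; either form prints each matching line once because of `next`.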

Answer 2

A grep alternative:

grep '^[^;]*;[^;]*;[^;]*\b2[0-9][0-9]\b' foo.csv

Output:

chris thomas; carpenter;  stand 289 section 2; area 5
tom sawyer; politician; stan 210 section 4; area 6

If 300 should be included as the upper boundary, you can use the following:

grep '^[^;]*;[^;]*;[^;]*\b\(2[0-9][0-9]\|300\)\b' foo.csv
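The same pattern can arguably be read more easily with extended regular expressions, where the group and alternation need no backslashes. The sketch below assumes GNU grep, since `\b` is an extension rather than part of POSIX; the fourth data row, the temporary file, and the variable name `grep_matches` are hypothetical and only there to exercise the 300 case.

```shell
# Sample data from the question plus one hypothetical row containing 300.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
john doe; lawyer; section 4 stand 356; area 5
chris thomas; carpenter; stand 289 section 2; area 5
tom sawyer; politician; stan 210 section 4; area 6
hypothetical row; tester; stand 300; area 7
EOF

# Skip the first two fields, then require 200-299 or 300 as a whole
# word somewhere inside the third field ([^;]* cannot cross a ';').
grep_matches=$(grep -E '^[^;]*;[^;]*;[^;]*\b(2[0-9][0-9]|300)\b' "$tmp")
printf '%s\n' "$grep_matches"
rm -f "$tmp"
```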

AWK: field contains a number within a range
