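Extracting specific text from the end of a custom (text) log file using bash

Time: 2019-06-19 11:27:16

Tags: awk

I only want the five most recent numbers (this is the bottom of the file; there may be hundreds of groups like this):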

{Previous entries would be above, perhaps hundreds of similar groups....}

[June 18, 2019, 12:37 pm Europe/Madrid +0200]
--------------------------------------------------

Added: 2
Modified: 3
Deleted: 1
Excluded: 2
Total Time: 5.09

[June 19, 2019, 12:37 pm Europe/Madrid +0200]
---------------------------------------------------

Added: 3
Modified: 0
Deleted: 2
Excluded: 1
Total Time: 6.18

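How can I extract the numbers from this file? I've tried various approaches with sed, but grabbing only the last five values in particular eludes me.

The output I'm looking for is: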

echo "<added>3</added>";
echo "<modified>0</modified>";
echo "<deleted>2</deleted>";
echo "<excluded>1</excluded>";
echo "<total>6.18</total>";

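I was asked to be specific about what I've tried, since the fact that I've contributed more than 200 answers here for others in my spare time apparently doesn't exempt me from being tested like a schoolchild, so here you go... this doesn't work: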

echo $file | awk -F'Added:' '{print $2}'

I hope this is useful.

2 answers:

Answer 0: (score: 1)

Using bash and a regular expression:

tail -n 5 file | while read -r line; do [[ $line =~ (.*):\ (.*) ]]; echo "${BASH_REMATCH[1]} ${BASH_REMATCH[2]}"; done

Output:

Added 3
Modified 0
Deleted 2
Excluded 1
Total Time 6.18

Update:

tail -n 5 file | while read -r line; do [[ $line =~ ([^\ ]*).*:\ (.*) ]]; echo "echo \"<${BASH_REMATCH[1],,}>${BASH_REMATCH[2]}</${BASH_REMATCH[1],,}>\""; done

Output:

echo "<added>3</added>"
echo "<modified>0</modified>"
echo "<deleted>2</deleted>"
echo "<excluded>1</excluded>"
echo "<total>6.18</total>"
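
To see what the updated pattern captures, here is a minimal sketch that applies it to a single line. The variable names (`line`, `re`, `tag`, `value`, `result`) are just for illustration, and the `${var,,}` lowercase expansion assumes bash 4 or newer. The first group takes everything up to the first space, so "Total Time" becomes just "total".

```shell
#!/usr/bin/env bash
# Illustrative only: one sample line instead of the piped tail output.
line='Total Time: 6.18'

# Same pattern as in the update, stored in a variable so no escaping is needed.
re='([^ ]*).*: (.*)'

if [[ $line =~ $re ]]; then
    tag=${BASH_REMATCH[1],,}     # first word of the label, lowercased (bash 4+)
    value=${BASH_REMATCH[2]}     # everything after ": "
    result="<$tag>$value</$tag>"
    printf '%s\n' "$result"
fi
```

Note that `tail -n 5` relies on the five values being the very last lines of the file; a trailing blank line would shift the window.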

Answer 1: (score: 1)

$ awk -F'[: ]+' '
    NF { tags[++numTags]=tolower($1); vals[numTags]=$NF; next }
    { numTags=0 }
    END {
        for (tagNr=1; tagNr<=numTags; tagNr++) {
            printf "echo \"<%s>%s</%s>\";\n", tags[tagNr], vals[tagNr], tags[tagNr]
        }
    }
' file
echo "<added>3</added>";
echo "<modified>0</modified>";
echo "<deleted>2</deleted>";
echo "<excluded>1</excluded>";
echo "<total>6.18</total>";
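
The script above resets `numTags` on every blank line, so if the log ever ends with an empty line, nothing would be printed. A possible variation, assuming that case matters to you, defers the reset until the next non-blank line arrives. The `sawBlank` flag and the temporary sample file are my own additions for the sketch, and it prints the bare tags rather than the `echo` wrappers.

```shell
#!/usr/bin/env bash
# Hypothetical sample log; the trailing blank line is deliberate.
log=$(mktemp)
printf '%s\n' \
    '[June 19, 2019, 12:37 pm Europe/Madrid +0200]' \
    '---------------------------------------------------' \
    '' \
    'Added: 3' \
    'Modified: 0' \
    'Deleted: 2' \
    'Excluded: 1' \
    'Total Time: 6.18' \
    '' > "$log"

out=$(awk -F'[: ]+' '
    NF {
        # Start a new group only when a non-blank line follows a blank one.
        if (sawBlank) { numTags = 0; sawBlank = 0 }
        tags[++numTags] = tolower($1); vals[numTags] = $NF
        next
    }
    { sawBlank = 1 }   # blank line: remember it, but keep the current group
    END {
        for (tagNr = 1; tagNr <= numTags; tagNr++)
            printf "<%s>%s</%s>\n", tags[tagNr], vals[tagNr], tags[tagNr]
    }
' "$log")
rm -f "$log"

printf '%s\n' "$out"
```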