Question

Consider the following log file:

FSDFFDSFFDSFDS VCXVCXVCX 3343022340 IT_ON FDSFR0W3EV VXDF03
DDSDS232323SD DSADFSDA SDA32323 SDADSDQ SDAFDSADS SDA DSADSE3QZCD
DDSDS232323SD DSADFSDA SDA32323 SDADSDQ SDAFDSADS SDA DSADSE3QZCD
DDSDSDEERWREF FSFDSDFFDS  SDA32323 SDADSDQ SDAFDSADS SDA DSADSE3Q
DDSDS232323SD DSADFSDA SDA32323 SDADSDQ SDAFDSADSDA 
DSADSE3QZCD FFDSFDAREDFS 23FDSFDDS  IT_ON FDSFR0W3EV VXDF03ETRRT
FFDSFDAREDFS 23FDSFDDSFK 3343022340 IT_OFF FDSFR0W3EV VXDF03ETRRT
DDSDSDEERWREF FSFDSDFFDS  SDA32323 SDADSDQ SDAFDSADS SDA DSADSE3QZCD
DDSDS232323SD DSADFSDA SDA32323 SDADSDQ SDAFDSADS SDA DSADSE3QZCD
FFDSFDAREDFS 23FDSFDDSFK 3343022340 IT_ON FDSFR0W3EV VXDF03ETRRT
FFDSFDAREDFS 23FDSFDDSFK 3343022340 IT_OFF FDSFR0W3EV VXDF03ETRRF
DDSDSDEERWREF FSFDSDFFDS  SDA32323 SDADSDQ SDAFDSADS SDA DSADSE3QZCD
DDSDS232323SD DSADFSDA SDA32323 SDADSDQ SDAFDSADS SDA DSADSE3QZCD
FFDSFDAREDFS 23FDSFDDSFK 3343022340 IT_ON FDSFR0W3EV VXDF03ETRRT
FFDSFDAREDFS 23FDSFDDSFK 3343022340 IT_OFF FDSFR0W3EV VXDF03ETRR
FFDSFDAREDFS 23FDSFDDSFK 3343022340 IT_OFF FDSFR0W3EV VXDF03ETRR

I have to count the number of transitions from IT_ON to IT_OFF and from IT_OFF to IT_ON, i.e.

IT_ON to IT_OFF : 3
IT_OFF to IT_ON : 2

I have been trying to use grep "IT_ON" and grep "IT_OFF" together with if statements, but it gets complicated. Any help?

Answer 1

awk '/IT_ON/ {on = 1; if (off) {off_to_on++}; off = 0} /IT_OFF/ {off = 1; if (on) {on_to_off++}; on = 0} END {print "IT_ON to IT_OFF :", on_to_off + 0; print "IT_OFF to IT_ON :", off_to_on + 0}' inputfile

Split across multiple lines:

awk '
    /IT_ON/ {
        on = 1
        if (off) {
            off_to_on++     # previous state was OFF
        }
        off = 0
    }
    /IT_OFF/ {
        off = 1
        if (on) {
            on_to_off++     # previous state was ON
        }
        on = 0
    }
    END {
        # "+ 0" prints 0 instead of an empty string when a counter was never incremented
        print "IT_ON to IT_OFF :", on_to_off + 0
        print "IT_OFF to IT_ON :", off_to_on + 0
    }' inputfile

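If you need to track the transitions per ID, you can use the same technique with arrays. You may also need a flag that sets the ON state the first time it is seen, to make sure that an initial ON is counted as an off-to-on transition.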
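The per-ID idea can be sketched with awk arrays as below. This is only an illustration under assumptions that the question does not confirm: it treats field 3 as the ID and field 4 as the state, and the function name `count_by_id` is made up for the example. It does not count the first state seen for an ID as a transition.

```shell
#!/bin/sh
# Sketch: count transitions separately for each ID.
# ASSUMPTION: field 3 is the ID and field 4 is IT_ON / IT_OFF.
count_by_id() {
    awk '
        $4 == "IT_ON" || $4 == "IT_OFF" {
            id = $3
            # Count only when this ID was seen before and its state changed.
            if ((id in last) && last[id] != $4) {
                if ($4 == "IT_OFF") on_to_off[id]++
                else                off_to_on[id]++
            }
            last[id] = $4
        }
        END {
            # Iteration order is unspecified; pipe through sort if order matters.
            for (id in last)
                printf "%s ON->OFF %d OFF->ON %d\n", id, on_to_off[id], off_to_on[id]
        }' "$@"
}
```

Usage would be something like `count_by_id inputfile | sort`. Comparing whole fields with `==` instead of matching `/IT_ON/` anywhere on the line avoids accidental hits in other columns, at the cost of depending on the assumed column layout.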

Answer 2

Here is another approach:

 grep -Po "IT_(ON|OFF)" inputFile \
 | uniq | paste - - \
 | awk 'NR==1 && NF==2{print;f=1}END{if(f)printf "%3d\t%3d\n", NR,NR-1}'

Output format:

IT_ON   IT_OFF
  3       2

Answer 3

Assuming your data file is named data.log:

grep -Eo 'IT_(ON|OFF)' data.log | uniq | tail -n +2 |sort |uniq -c

Output:

3 IT_OFF
2 IT_ON

Annotated:

grep -Eo 'IT_(ON|OFF)' data.log $(: -E for extended regex, -o to only print matching part ) \
  | uniq                        $(: deduplicate adjacent items ) \
  | tail -n +2                  $(: drop the first line )        \
  | sort | uniq -c              $(: sort , then give a count for each unique item )
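If the labels from the question are wanted, the counts from this pipeline could be relabelled with a small awk step. The sketch below is one way to do it; the function name `label_transitions` is hypothetical, and it assumes a single input file (or standard input).

```shell
#!/bin/sh
# Sketch: same pipeline as above, with the counts relabelled.
# After dropping the first state, each remaining line is the state that a
# transition ended in, so its count is the number of transitions into it.
label_transitions() {
    grep -Eo 'IT_(ON|OFF)' "$@" | uniq | tail -n +2 | sort | uniq -c |
        awk '{
            from = ($2 == "IT_OFF") ? "IT_ON" : "IT_OFF"
            print from " to " $2 " : " $1
        }'
}
```

Called as `label_transitions data.log`. A direction that never occurs produces no line at all rather than a zero count.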

Answer 4

This is not exactly what you asked for, but it may work:

sed -n 's/.*\(IT_ON\|IT_OFF\).*/\1/p' input | uniq > input.tmp
grep $(head -1 input.tmp) input.tmp | uniq -c
expr $(grep $(head -2 input.tmp | tail -1) input.tmp | wc -l) - 1
rm input.tmp

Answer 5

Here is a bash shell script that does what you asked:

#!/bin/bash

testfile="test.txt"

uniques=$(command grep -o IT_O. "$testfile" | uniq)
count=$(echo "$uniques" | paste - - | grep -c "IT_O.[[:space:]]IT_O.")

if [[ ${uniques:0:5} = "IT_ON" ]]; then
    echo "IT_ON  -> IT_OFF: $count"
    echo "IT_OFF -> IT_ON : $(($count-1))"
else
    echo "IT_ON  -> IT_OFF: $(($count-1))"
    echo "IT_OFF -> IT_ON : $count"
fi

Unfortunately, I could not spend much time testing it. Please run a few trials to see whether it is robust enough for your use case.

Answer 6

In awk:

/IT_ON/        { on=1; }
on && /IT_OFF/ { offs++; on=0; off=1; }
off && /IT_ON/ { ons++; off=0; on=1; }
END {
  printf("ON to OFF: %d\nOFF to ON: %d\n", offs, ons);
}

Returns:

ON to OFF: 3
OFF to ON: 2

You can implement the same logic in any language, including shell, but this seems the cleanest to me.
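For comparison, the same state-tracking logic in plain POSIX shell might look like the sketch below. The function name `count_transitions` is made up for this example, and it reads the log from standard input.

```shell
#!/bin/sh
# Sketch: remember the previous state and count each change of state.
count_transitions() {
    prev="" on_to_off=0 off_to_on=0
    # The "|| [ -n ... ]" part also handles a last line without a newline.
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            *IT_OFF*) cur=OFF ;;
            *IT_ON*)  cur=ON ;;
            *)        continue ;;   # line carries no state
        esac
        if [ -n "$prev" ] && [ "$cur" != "$prev" ]; then
            if [ "$cur" = OFF ]; then
                on_to_off=$((on_to_off + 1))
            else
                off_to_on=$((off_to_on + 1))
            fi
        fi
        prev=$cur
    done
    printf 'ON to OFF: %d\nOFF to ON: %d\n' "$on_to_off" "$off_to_on"
}
```

It would be run as `count_transitions < inputfile`. A shell loop like this is typically much slower than awk on large logs, which is one reason the awk version reads as the cleaner choice.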
