Searching for a specific value and counting in unsorted data on Linux

Time: 2016-05-10 08:18:25

Tags: linux shell scripting

I have a file containing the following data:

1,20160507057,VBATCH_20160507_00001,1000,GGG,OR1,20160507,ATP,VS12,TEST,Ver,

2,AVAILABLE,20160507T13:23:19,ver,,

2,USED,20160507T16:45:00,,12394301044,803123123314626251006

1,20160507331,VBATCH_20160507_00003,1000,GGG,OR1,20160508,ATP,Pure,vour,Test,

2,POP,20160507T16:10:27,ver,,

2,AVAILABLE,20160507T16:17:42,ver,,

1,20160507441,VBATCH_20160507_00003,1000,GGG,OR1,20160508,ATP,Pure,vour,Test,

2,POP,20160507T16:10:27,ver,,

2,AVAILABLE,20160507T16:17:42,ver,,

Each record starts with a header line like this:

1,20160507331,VBATCH_20160507_00003,1000,GGG,OR1,20160508,ATP,Pure,vour,Test,

These are the child lines of the record above:

2,POP,20160507T16:10:27,ver,,

2,AVAILABLE,20160507T16:17:42,ver,,

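So every header line is followed by some child lines. What I need is this:

For each record whose last line is AVAILABLE, I need all of that record's data along with the second column of its first line.

Example: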

1,20160507331,VBATCH_20160507_00003,1000,GGG,OR1,20160508,ATP,Pure,vour,Test,

2,POP,20160507T16:10:27,ver,,

2,AVAILABLE,20160507T16:17:42,ver,,

Only a record like the one above should be considered.

Output:

20160507331  Available 

2 Answers:

Answer 0: (score: 0)

Create a file named test_script.py:

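import sys

with open(sys.argv[1], 'r') as f:
    last_id = None      # ID (second column) of the current record's header line
    last_value = None   # status (second column) of the most recent child line
    for line in f:
        if line.startswith('1,'):
            # A new header closes the previous record: report it if it ended in AVAILABLE.
            if last_id is not None and last_value == 'AVAILABLE':
                print('%s %s' % (last_id, last_value))
            last_id = line.split(',')[1]
            last_value = None  # reset so a record without child lines inherits nothing
        elif line.startswith('2,'):
            last_value = line.split(',')[1]
    # The final record has no following header, so check it after the loop.
    if last_id is not None and last_value == 'AVAILABLE':
        print('%s %s' % (last_id, last_value))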

Then run it with: python test_script.py your_file_path

I hope this helps.
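The question also asks for all of the record's data, not just the ID. A minimal sketch of that, assuming (as in the sample) that header lines start with `1,` and child lines start with `2,`; the function name `available_records` is hypothetical and not part of the script above:

```python
def available_records(lines):
    """Yield (record_id, record_lines) for records whose last child line is AVAILABLE."""
    record = []  # lines of the record currently being read

    def qualifies(rec):
        # A record qualifies when it has at least one child line
        # and the second field of its last line is AVAILABLE.
        return len(rec) > 1 and rec[-1].split(',')[1] == 'AVAILABLE'

    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith('1,'):
            # A new header closes the previous record.
            if qualifies(record):
                yield record[0].split(',')[1], record
            record = [line]
        elif line.startswith('2,') and record:
            record.append(line)
    # The last record has no following header, so check it here.
    if qualifies(record):
        yield record[0].split(',')[1], record
```

It could be called as `for record_id, record_lines in available_records(open(path)): ...`. On the sample file it should yield the records `20160507331` and `20160507441`; `20160507057` is skipped because its last line is `USED`.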

Answer 1: (score: 0)

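#!/bin/bash
source=$1

while IFS= read -r line
do
    # Header lines have 12 comma-separated fields (the trailing comma adds an empty one).
    column_width=$(echo "$line" | awk -F, '{print NF}')

    if [ "$column_width" -eq 12 ]; then
        # Take the second line after the header; this assumes each record has
        # exactly two child lines with no blank lines in between.
        # -F matches the header as a fixed string rather than a regex.
        if grep -F -A2 -- "$line" "$source" | tail -1 | grep -q AVAILABLE; then
            id=$(echo "$line" | awk -F, '{print $2}')
            echo "$id AVAILABLE"
        fi
    fi
done < "$source"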

Use it like this:

./script FileName.txt
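Because `grep -A2` only looks at the second line after each header, the script above relies on every record having exactly two child lines. A rough sketch that drops that assumption and also produces the count mentioned in the title, written in Python rather than shell; `count_final_statuses` is a hypothetical name, and the `1,` / `2,` line prefixes are assumed from the sample data:

```python
from collections import Counter


def count_final_statuses(lines):
    """Count records by the status found on their last child line."""
    counts = Counter()
    in_record = False    # True once a header line has been seen
    last_status = None   # status of the most recent child line in this record

    for raw in lines:
        line = raw.strip()
        if line.startswith('1,'):
            # A new header closes the previous record, if it had any child lines.
            if in_record and last_status is not None:
                counts[last_status] += 1
            in_record = True
            last_status = None
        elif line.startswith('2,') and in_record:
            last_status = line.split(',')[1]

    # Close the final record, which has no header after it.
    if in_record and last_status is not None:
        counts[last_status] += 1
    return counts
```

For example, `count_final_statuses(open('FileName.txt'))['AVAILABLE']` would give the number of records whose last line is AVAILABLE, however many child lines each record has.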