About the UNIX grep command

Time: 2010-02-22 08:32:25

Tags: shell awk unix

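I need to write a shell script that selects all files (not directories) in the /exp/files directory. For each file in the directory, I want to find out whether the last line of the file has been received. The last line of the file is a trailer record. The third field of the last line is the number of data records, e.g. 2315 (the total number of lines in the file minus 2, for the header and the trailer). In my UNIX shell script I want to verify that the last line is a trailer record by checking for T, and verify that the number of lines in the file equals (2315 + 2). If both checks succeed, I want to move the file to another directory, /exp/ready.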

tail -1 test.csv 
T,Test.csv,2315,80045.96

Also, in the input file, sometimes zero, one, or more fields of the trailer record are enclosed in double quotes:

"T","Test.csv","2315","80045.96"
"T", Test.csv, 2212,"80045.96"
T,Test.csv,2315,80045.96

6 answers:

Answer 0 (score: 1)

If you want to move the file only after it has been written and closed, you should consider using inotify, incron, FAM, gamin, or similar.
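
As a rough sketch of the inotify route, assuming a Linux system with the inotify-tools package installed (it provides inotifywait): the helper names process_closed_file and watch_dir are made up for this illustration, and the trailer test is only a stand-in for whatever validation you need.

```shell
#!/bin/bash
# Sketch only: assumes Linux with inotify-tools installed.
# process_closed_file and watch_dir are hypothetical helper names.

# Move a file to the ready directory if its last line starts with T, or "T",
process_closed_file() {
    local file=$1 ready_dir=$2
    if tail -n 1 "$file" | grep -Eq '^(T|"T"),'; then
        mv "$file" "$ready_dir"/
    fi
}

# Block forever, handling each file as soon as a writer closes it.
watch_dir() {
    local src_dir=$1 ready_dir=$2 name
    inotifywait -m -q -e close_write --format '%f' "$src_dir" |
    while IFS= read -r name; do
        process_closed_file "$src_dir/$name" "$ready_dir"
    done
}

# Example call (not run here): watch_dir /exp/files /exp/ready
```

The close_write event fires when a file that was open for writing is closed, which avoids inspecting a file that is still being written.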

Answer 1 (score: 1)

You can test for the presence of the trailer on the last line with the following:

tail -1 ${filename} | egrep '^T,|^"T",' >/dev/null 2>&1
rc=$?

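$rc will be 0 if the line starts with either T, or "T", (each including the trailing comma), assuming that is enough to identify the trailer record.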
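
A slightly more compact way to express the same test, offered as a sketch: grep -q suppresses output, so no redirection is needed, and the exit status can be used directly in an if. The function name has_trailer is invented for this example.

```shell
#!/bin/bash
# has_trailer is a hypothetical helper name, not part of the original answer.
# Succeeds (exit status 0) when the last line starts with T, or "T",
has_trailer() {
    tail -n 1 "$1" | grep -Eq '^(T|"T"),'
}

# Demonstration on a throwaway file.
demo=$(mktemp)
printf 'H,Test.csv,header\ndata row\n"T","Test.csv","1","80045.96"\n' > "$demo"
if has_trailer "$demo"; then
    echo "trailer found"
else
    echo "no trailer"
fi
rm -f "$demo"
```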

Once that is established, you can get the actual line count with:

lc=$(cat ${filename} | wc -l)

You can get the expected line count with:

elc=$(tail -1 ${filename} | awk -F, '{sub(/^"/,"",$3);print 2+$3}')

and compare the two.
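
For illustration, the comparison could look like the sketch below. The function name counts_match is hypothetical. It differs from the commands above in two small ways: it strips every double quote from the third field rather than only a leading one, and it reads the file through a redirect so that wc prints just the number.

```shell
#!/bin/bash
# counts_match is a hypothetical helper name used only for this sketch.
# Succeeds when the number of lines equals the trailer's third field plus 2
# (one header line and one trailer line).
counts_match() {
    local file=$1 lc elc
    lc=$(wc -l < "$file")
    elc=$(tail -n 1 "$file" | awk -F, '{ gsub(/"/, "", $3); print $3 + 2 }')
    # An empty file yields an empty elc, which must not count as a match.
    [[ -n $elc && $lc -eq $elc ]]
}

# Example call (not run here): counts_match /exp/files/Test.csv && echo complete
```

Note that wc -l counts newline characters, so a file whose last line has no terminating newline would be reported one line short.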

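So, putting all of that together, the following would be a good start. It outputs each file itself (my test files are num[1-9].tst) along with a message indicating whether the file is okay or, if not, why.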

#!/bin/bash
cd /exp/files
for fspec in *.tst ; do
    if [[ -f ${fspec} ]] ; then
        cat ${fspec} | sed 's/^/   /'
        tail -1 ${fspec} | egrep '^T,|^"T",' >/dev/null 2>&1
        rc=$?
        if [[ ${rc} -eq 0 ]] ; then
            lc=$(cat ${fspec} | wc -l)
            elc=$(tail -1 ${fspec} | awk -F, '{sub(/^"/,"",$3);print 2+$3}')
            if [[ ${lc} -eq ${elc} ]] ; then
                echo '***' File ${fspec} is done and dusted.
            else
                echo '***' File ${fspec} line count mismatch: ${lc}/${elc}.
            fi
        else
            echo '***' File ${fspec} has no valid trailer.
        fi
    else
        ls -ald ${fspec} | sed 's/^/   /'
        echo '***' File ${fspec} is not a regular file.
    fi
done

Sample run, showing the test files I used:

   H,Test.csv,other rubbish goes here
   this file does not have a trailer
*** File num1.tst has no valid trailer.
   H,Test.csv,other rubbish goes here
   this file does have a trailer with all quotes and correct count
   "T","Test.csv","1","80045.96"
*** File num2.tst is done and dusted.
   H,Test.csv,other rubbish goes here
   this file does have a trailer with all quotes but bad count
   "T","Test.csv","9","80045.96"
*** File num3.tst line count mismatch: 3/11.
   H,Test.csv,other rubbish goes here
   this file does have a trailer with all quotes except T, and correct count
   T,"Test.csv","1","80045.96"
*** File num4.tst is done and dusted.
   H,Test.csv,other rubbish goes here
   this file does have a trailer with no quotes on T or count and correct count
   T,"Test.csv",1,"80045.96"
*** File num5.tst is done and dusted.
   H,Test.csv,other rubbish goes here
   this file does have a traier with quotes on T only, and correct count
   "T",Test.csv,1,80045.96
*** File num6.tst is done and dusted.
   drwxr-xr-x+ 2 pax None 0 Feb 23 09:55 num7.tst
*** File num7.tst is not a regular file.
   H,Test.csv,other rubbish goes here
   this file does have a trailer with all quotes except the bad count
   "T","Test.csv",8,"80045.96"
*** File num8.tst line count mismatch: 3/10.
   H,Test.csv,other rubbish goes here
   this file does have a trailer with no quotes and a bad count
   T,Test.csv,7,80045.96
*** File num9.tst line count mismatch: 3/9.

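Answer 2 (score: 1)

This code does all of the logic in a single call to awk, which makes it very efficient. It also does NOT hard-code the example value of 2315; instead it uses the value contained in the trailer line, as I believe that was your intent.

Be sure to remove the echo once you are satisfied with the results.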

#!/bin/bash

for file in /exp/files/*; do
  if [[ -f "$file" ]]; then
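    # Strip quotes from the saved fields themselves so that "T" and "2315" compare correctly.
    if nawk -F, '{v0=$0;v1=$1;v3=$3}END{gsub(/"/,"",v1);gsub(/"/,"",v3);exit !(v1 == "T" && NR == v3+2)}' "$file"; then
      echo mv "$file" /exp/ready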
    fi
  fi
done

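Update

I had to add {v0=$0;v1=$1;v3=$3} because the SunOS implementation of awk does not support accessing the field variables ($0, $1, $2, etc.) inside END {}; they must instead be saved to user-defined variables if you want to process them inside END {}. See the last row of the first table in this awk feature comparison link.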
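
To see the exit-status idea in isolation, here is a small sketch that runs plain awk on throwaway files. The wrapper name check_file and the variable names first and count are invented for this example. The main rule saves the fields of every line, so END sees those of the last one, and exit !(condition) turns a true condition into exit status 0.

```shell
#!/bin/bash
# check_file is a hypothetical wrapper name for this sketch.
check_file() {
    awk -F, '
        { first = $1; count = $3 }          # remember fields of the current line
        END {
            gsub(/"/, "", first)            # drop optional double quotes
            gsub(/"/, "", count)
            exit !(first == "T" && NR == count + 2)
        }' "$1"
}

good=$(mktemp); bad=$(mktemp)
printf 'H,Test.csv,header\nrow\n"T",Test.csv,"1",80045.96\n' > "$good"
printf 'H,Test.csv,header\nrow\nT,Test.csv,7,80045.96\n' > "$bad"
check_file "$good" && echo "good: ok"
check_file "$bad" || echo "bad: rejected"
rm -f "$good" "$bad"
```

Because the function's status is awk's exit status, it can be used directly as the condition of an if, just like the nawk call in the loop above.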

Answer 3 (score: 0)

I don't have a UNIX shell handy here, but

#!/bin/bash
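# Read one path per line into a bash array (mapfile needs bash 4+;
# paths containing newlines are not handled).
mapfile -t files < <(find /exp/files -type f)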

should put all of the files into a BASH array; iterating over each of them as paxdiablo suggested above should then get you sorted.
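
One possible way to fill and walk such an array, sketched under some assumptions: the helper name collect_files is invented, and -maxdepth and -print0 are extensions found in GNU and BSD find rather than in every UNIX find. NUL-delimited names keep paths with spaces intact, and -maxdepth 1 stays out of subdirectories.

```shell
#!/bin/bash
# collect_files is a hypothetical helper; -maxdepth and -print0 are
# extensions found in GNU and BSD find, not in every UNIX find.
collect_files() {
    local dir=$1 path
    files=()
    while IFS= read -r -d '' path; do
        files+=("$path")
    done < <(find "$dir" -maxdepth 1 -type f -print0)
}

# Demonstration on a throwaway directory.
demo_dir=$(mktemp -d)
touch "$demo_dir/a.csv" "$demo_dir/b with space.csv"
mkdir "$demo_dir/subdir"

collect_files "$demo_dir"
for f in "${files[@]}"; do
    echo "regular file: ${f##*/}"
done
rm -rf "$demo_dir"
```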

Answer 4 (score: 0)

destination=/exp/ready
for file in /exp/files/*.csv
do
    var=$(tail -1 "$file" | awk -F"," '{ gsub(/\042|\047/,"") }
    $1=="T" && $3 == "2315" { print "ok" }')
    if [ "$var" = "ok" ]; then
        echo mv "$file" "$destination"
    else
        echo "invalid: $file"
    fi
done

Answer 5 (score: 0)

#!/bin/bash

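# Write the helper script; the quoted 'HERE' keeps $1 from being expanded now.
cat > findready.sh <<'HERE'
#!/bin/bash
# Move "$1" to /exp/ready if it has 2317 lines and its last line starts with T.
NUMLINES=$(wc -l < "$1")
TRAILER=$(tail -1 "$1" | tr -d '"' | sed 's/^\(.\).*$/\1/')

if [[ $NUMLINES -eq 2317 && $TRAILER == "T" ]] ; then
    mv "$1" /exp/ready/
fi
HERE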

chmod a+x findready.sh

find /exp/files/ -type f -name '*.csv' -exec ./findready.sh {} ';' > /dev/null 2>&1