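I need to write a shell script that selects all files (not directories) in the /exp/files directory. For each file, I want to find out whether the last line of the file has been received. The last line of a file is a trailer record. The third field of that last line is the number of data records, here 2315 (the total number of lines in the file minus 2, for the header and the trailer). In my unix shell script, I want to confirm that the last line is a trailer record by checking for T, and confirm that the number of lines in the file equals (2315 + 2). If both checks succeed, I want to move the file to another directory, /exp/ready.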
tail -1 test.csv
T,Test.csv,2315,80045.96
Also, in the input files, any of the trailer record's fields may or may not be enclosed in double quotes:
"T","Test.csv","2315","80045.96"
"T", Test.csv, 2212,"80045.96"
T,Test.csv,2315,80045.96
Answer 0 (score: 1)
If you want to move a file only after it has been written and closed, you should consider using inotify, incron, FAM, gamin, or similar.
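As a rough sketch of the inotify route: the loop below assumes the `inotifywait` tool from the inotify-tools package is installed (Linux only) and reacts to `close_write` events, which fire once a writer has closed the file. The function name `watch_closed_files` and the `echo` placeholder are made up for illustration; the trailer check and the `mv` would go where the `echo` is.

```shell
#!/bin/sh
# Hypothetical helper: report each file in a directory once a writer closes it.
# Assumes inotifywait (inotify-tools) is available; Linux only.
watch_closed_files() {
    dir=$1
    # -m: keep monitoring; -q: suppress startup messages;
    # -e close_write: fire when a file opened for writing is closed;
    # --format '%f': print only the name of the file that triggered the event.
    inotifywait -m -q -e close_write --format '%f' "$dir" |
    while IFS= read -r name; do
        # Placeholder: replace with the trailer check and the mv.
        echo "closed: $dir/$name"
    done
}

# Usage (runs until interrupted):
# watch_closed_files /exp/files
```

A close event only says the writer is finished with the file handle, so the trailer and line-count checks are still worth doing before moving anything.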
Answer 1 (score: 1)
You can test for the presence of the trailer line with something like:
tail -1 ${filename} | egrep '^T,|^"T",' >/dev/null 2>&1
rc=$?
If the line begins with T, or "T", then $rc will be 0, assuming that is enough to identify the trailer record.
Once that is established, you can get the actual line count with:
lc=$(cat ${filename} | wc -l)
You can get the expected line count with:
elc=$(tail -1 ${filename} | awk -F, '{sub(/^"/,"",$3);print 2+$3}')
and compare the two.
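The three steps above can be wrapped in small functions, which makes them easier to try against single files. This is only a sketch: the function names are invented here, and the `gsub` is a slightly more tolerant variant of the `sub` above (it also drops a space before the count, as in one of the question's sample trailers).

```shell
#!/bin/sh
# has_trailer FILE: succeed if the last line starts with T, or "T",
has_trailer() {
    tail -n 1 "$1" | grep -Eq '^T,|^"T",'
}

# expected_lines FILE: print the trailer's third field plus 2
# (one header line and one trailer line).
expected_lines() {
    tail -n 1 "$1" | awk -F, '{ gsub(/[" \t]/, "", $3); print $3 + 2 }'
}

# actual_lines FILE: print the number of lines in the file.
actual_lines() {
    wc -l < "$1" | tr -d ' '
}

# Usage:
# if has_trailer f.csv && [ "$(actual_lines f.csv)" -eq "$(expected_lines f.csv)" ]; then
#     echo "f.csv is complete"
# fi
```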
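So, putting all of that together, this would be a good start. It outputs the file itself (my test files are num[1-9].tst) along with a message indicating whether the file is okay or why it is not.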
#!/bin/bash
cd /exp/files || exit 1
for fspec in *.tst ; do
    if [[ -f ${fspec} ]] ; then
        # Show the file contents, indented.
        cat "${fspec}" | sed 's/^/ /'
        # Does the last line start with T, or "T", ?
        tail -1 "${fspec}" | egrep '^T,|^"T",' >/dev/null 2>&1
        rc=$?
        if [[ ${rc} -eq 0 ]] ; then
            # Actual line count versus trailer count plus header and trailer.
            lc=$(cat "${fspec}" | wc -l)
            elc=$(tail -1 "${fspec}" | awk -F, '{sub(/^"/,"",$3);print 2+$3}')
            if [[ ${lc} -eq ${elc} ]] ; then
                echo '***' File ${fspec} is done and dusted.
            else
                echo '***' File ${fspec} line count mismatch: ${lc}/${elc}.
            fi
        else
            echo '***' File ${fspec} has no valid trailer.
        fi
    else
        ls -ald "${fspec}" | sed 's/^/ /'
        echo '***' File ${fspec} is not a regular file.
    fi
done
A sample run, showing the test files I used:
H,Test.csv,other rubbish goes here
this file does not have a trailer
*** File num1.tst has no valid trailer.
H,Test.csv,other rubbish goes here
this file does have a trailer with all quotes and correct count
"T","Test.csv","1","80045.96"
*** File num2.tst is done and dusted.
H,Test.csv,other rubbish goes here
this file does have a trailer with all quotes but bad count
"T","Test.csv","9","80045.96"
*** File num3.tst line count mismatch: 3/11.
H,Test.csv,other rubbish goes here
this file does have a trailer with all quotes except T, and correct count
T,"Test.csv","1","80045.96"
*** File num4.tst is done and dusted.
H,Test.csv,other rubbish goes here
this file does have a trailer with no quotes on T or count and correct count
T,"Test.csv",1,"80045.96"
*** File num5.tst is done and dusted.
H,Test.csv,other rubbish goes here
this file does have a traier with quotes on T only, and correct count
"T",Test.csv,1,80045.96
*** File num6.tst is done and dusted.
drwxr-xr-x+ 2 pax None 0 Feb 23 09:55 num7.tst
*** File num7.tst is not a regular file.
H,Test.csv,other rubbish goes here
this file does have a trailer with all quotes except the bad count
"T","Test.csv",8,"80045.96"
*** File num8.tst line count mismatch: 3/10.
H,Test.csv,other rubbish goes here
this file does have a trailer with no quotes and a bad count
T,Test.csv,7,80045.96
*** File num9.tst line count mismatch: 3/9.
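The script above only reports on each file. A variant that actually moves the good files, which is what the question asks for, might look like the following sketch. The function name `move_ready` is invented here, and the two directory arguments default to the paths given in the question.

```shell
#!/bin/sh
# Sketch: move every regular file in SRC whose trailer and line count agree into DST.
# move_ready is a hypothetical name; the defaults are the paths from the question.
move_ready() {
    src=${1:-/exp/files}
    dst=${2:-/exp/ready}
    for fspec in "$src"/*; do
        # Skip directories and anything else that is not a regular file.
        [ -f "$fspec" ] || continue
        # Skip files whose last line is not a trailer record.
        tail -n 1 "$fspec" | grep -Eq '^T,|^"T",' || continue
        # Actual line count (tr strips the padding some wc versions add).
        lc=$(wc -l < "$fspec" | tr -d ' ')
        # Expected count: third trailer field plus header and trailer.
        elc=$(tail -n 1 "$fspec" | awk -F, '{sub(/^"/,"",$3); print 2+$3}')
        if [ "$lc" -eq "$elc" ]; then
            mv "$fspec" "$dst"/
        fi
    done
}

# Usage:
# move_ready /exp/files /exp/ready
```

Files that fail either check are simply left in place, so the function can be rerun later once they are complete.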
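Answer 2 (score: 1)
This code does all of the logic in a single invocation of awk, which makes it very efficient. It also does NOT hard-code the example value of 2315; it uses the value contained in the trailer line instead, since I assume that is your intent.
If you are happy with the results, be sure to remove the echo.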
#!/bin/bash
for file in /exp/files/*; do
    if [[ -f "$file" ]]; then
        # Strip the optional quotes from the saved fields before comparing them.
        if nawk -F, '{v0=$0;v1=$1;v3=$3}END{gsub(/"/,"",v1);gsub(/"/,"",v3);exit !(v1 == "T" && NR == v3+2)}' "$file"; then
            echo mv "$file" /exp/ready
        fi
    fi
done
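I had to add {v0=$0;v1=$1;v3=$3} because the SunOS awk implementation does not support accessing the field variables ($0, $1, $2, etc.) inside END {}; they have to be saved to user-defined variables if you want to process them inside END {}. See this awk feature comparison link.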
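To illustrate the save-then-use-in-END pattern on its own, here is a small sketch that should behave the same on awk implementations that do and do not keep the last record around in END. The function name `trailer_ok` and the variable names are made up, and plain `awk` is used instead of `nawk` on the assumption that it is a POSIX-style awk.

```shell
#!/bin/sh
# trailer_ok FILE: exit 0 if the last record is a T trailer whose third
# field plus 2 equals the total number of lines, non-zero otherwise.
trailer_ok() {
    awk -F, '
        # Runs for every record, so after the last one these hold the trailer fields.
        { first = $1; count = $3 }
        END {
            gsub(/"/, "", first)        # drop optional double quotes around T
            gsub(/[" ]/, "", count)     # drop quotes and spaces around the count
            exit !(first == "T" && NR == count + 2)
        }' "$1"
}

# Usage:
# trailer_ok /exp/files/Test.csv && echo "complete"
```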
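Answer 3 (score: 0)
I don't have a UNIX shell handy here, but
#!/bin/bash
# Note: this word-splits, so it breaks on file names containing whitespace.
files=( $(find /exp/files -type f) )
should put all the files into a Bash array; iterating over each of them, as paxdiablo suggested above, should then get you sorted.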
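For the iteration itself, one option that sidesteps arrays (and so also runs under plain sh) is to read the `find` output line by line. This is a sketch with an invented function name, `check_each`, and a placeholder `echo` where the per-file check would go. Note that `find` descends into subdirectories, which may be more than the question wants, and that file names containing newlines would still confuse the loop.

```shell
#!/bin/sh
# check_each DIR: visit every regular file found under DIR.
check_each() {
    find "$1" -type f | while IFS= read -r path; do
        # Placeholder: run the trailer and line-count checks here.
        echo "checking: $path"
    done
}

# Usage:
# check_each /exp/files
```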
Answer 4 (score: 0)
destination=/exp/ready
for file in /exp/files/*.csv
do
    var=$(tail -1 "$file" | awk -F"," '{ gsub(/\042|\047/,"") }
        $1=="T" && $3 == "2315" { print "ok" }')
    if [ "$var" = "ok" ]; then
        echo mv "$file" "$destination"
    else
        echo "invalid: $file"
    fi
done
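In case the `\042|\047` in that `gsub` looks cryptic: those are octal escapes for the double quote and the single quote, written that way so they do not clash with the shell's own quoting around the awk program. A tiny sketch, with an invented function name, that shows the effect in isolation:

```shell
#!/bin/sh
# strip_quotes: copy stdin to stdout with double (\042) and single (\047) quotes removed.
strip_quotes() {
    awk '{ gsub(/\042|\047/, ""); print }'
}

# Usage:
# tail -1 Test.csv | strip_quotes
```

Because `gsub` without a third argument edits the whole record, awk re-splits the fields afterwards, which is why `$1` and `$3` in the answer compare cleanly against unquoted values.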
Answer 5 (score: 0)
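#!/bin/bash
# Create the helper script findready.sh with ex.
ex findready.sh <<'HERE'
i
#!/bin/bash
# Line count alone (wc -l FILE would also print the file name).
NUMLINES=$(wc -l < "$1")
# First character of the last line, with any double quotes removed.
TRAILER=$(tail -1 "$1" | tr -d '"' | sed 's/^\(.\).*$/\1/')
if [[ $NUMLINES -eq 2317 && $TRAILER == "T" ]] ; then
    mv "$1" /exp/ready/
fi
.
wq
HERE
chmod a+x findready.sh
find /exp/files/ -type f -name '*.csv' -exec ./findready.sh {} ';' > /dev/null 2>&1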