The files in the folder follow this naming format:
File format - fact_type_<key>_<partid>
fact_type_123_1
fact_type_123_2
fact_type_123_3
fact_type_123_4
fact_type_124_1
fact_type_124_2
fact_type_124_3
fact_type_124_4
..
fact_type_130_1
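Each key should have 4 files (i.e. Key1 should have 4 files ending with 1, 2, 3 and 4).
Keys should be sequential; for example, in the listing above, the file after fact_type_124_4 should be fact_type_125_1.
These files are loaded by an external process, and the next process will fail if any file is missing between the start key and the end key (4 files for each key, for all keys from 123 through 130).
Right now I'm using the cut command and copying the data into Excel to find all the missing keys: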
ls -1a | cut -d '_' -f3 | sort | uniq
Please help me validate this in the folder with a command.
Answer 0 (score: 1)
With bash and GNU sort:
for f1 in fact_type_*; do
  echo "${f1%_[0-9]}"
done | sort -u |\
while read -r f2; do
  for ((i=1; i<=4; i++)); do
    f="${f2}_${i}"
    [[ ! -e "$f" ]] && echo "missing $f"
  done
done
Output (e.g.):
missing fact_type_126_4
missing fact_type_127_1
missing fact_type_127_2
missing fact_type_127_4
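Note that the loop above only looks at keys that already have at least one file, so a key whose four files are all absent is never reported. A minimal sketch that walks the whole key range instead, assuming the first and last key are known up front (the helper name `check_range` and its arguments are hypothetical, not part of the answer above):

```shell
# check_range DIR FIRST LAST   (hypothetical helper, for illustration only)
# Prints "missing <name>" for every absent part file of every key in
# FIRST..LAST and exits non-zero if anything is missing.
# The body runs in a subshell, so its variables do not leak to the caller.
check_range() (
    dir=$1 key=$2 last=$3
    status=0
    while [ "$key" -le "$last" ]; do
        for part in 1 2 3 4; do
            name="fact_type_${key}_${part}"
            if [ ! -e "$dir/$name" ]; then
                echo "missing $name"
                status=1
            fi
        done
        key=$((key + 1))
    done
    exit "$status"
)
```

For the folder in the question this would be called as `check_range . 123 130`; the exit status can then be used to stop the next process from starting.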
Answer 1 (score: 0)
So, the constraints are:
Each key should have 4 files
Keys should be sequential
So this is what I came up with.
Script:
check() {
    local keys
    keys=$(
        # find all the files (part ids run from 1 to 4)
        find "$1" -regex '.*/fact_type_[0-9]+_[1-4]' \
            -type f -printf "%f\n" |
        # extract the keys
        cut -d_ -f3
    )
    if [ -z "$keys" ]; then
        echo "No files found"
        return 255
    fi
    local nonexisting
    nonexisting=$(
        # sort it numerically
        <<<"$keys" sort -n |
        # extract first and last key only
        sed -n '1p;$p' |
        # generate sequence
        xargs seq |
        # append {1..4} to all keys
        xargs -I{} printf "%s\n" "fact_type_{}_"{1..4} |
        # print only nonexisting files
        # (names are tested relative to the current directory, not "$1")
        xargs -L1 sh -c '[ ! -e "$1" ] && printf "%s\n" "$1"' --
    )
    if [ -n "$nonexisting" ]; then
        <<<"$nonexisting" xargs printf "File %s does not exists\n"
        return "$(<<<"$nonexisting" wc -l)"
    fi
}
touch fact_type_{123..130}_{1..4}
check . # all ok
rm fact_type_130_1
rm fact_type_125_4
check . # two files missing
This will output (the first check . prints nothing; only the second one does):
File fact_type_125_4 does not exists
File fact_type_130_1 does not exists
Tested on repl.
Answer 2 (score: 0)
Using GNU awk for arrays of arrays and sorted_in:
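The same "generate the expected names, then report the ones that do not exist" idea can also be sketched as a set difference with `comm`, which avoids starting one `sh` per file name. This is only an illustration under the assumption that the first and last key are passed in explicitly; the helper name `list_missing` is hypothetical:

```shell
# list_missing DIR FIRST LAST   (hypothetical helper, for illustration only)
# Prints every expected name fact_type_<key>_<1..4> for keys FIRST..LAST
# that is not present in DIR. Runs in a subshell so the variables, the
# locale setting and the cleanup trap stay local to the call.
list_missing() (
    dir=$1 first=$2 last=$3
    export LC_ALL=C                      # same collation for sort and comm
    expected=$(mktemp) actual=$(mktemp)
    trap 'rm -f "$expected" "$actual"' EXIT

    # expected names: 4 parts for every key in the range
    awk -v a="$first" -v b="$last" 'BEGIN {
        for (k = a; k <= b; k++)
            for (p = 1; p <= 4; p++)
                print "fact_type_" k "_" p
    }' | sort > "$expected"

    # names that actually exist in DIR (directory prefix stripped)
    for f in "$dir"/fact_type_*; do
        [ -e "$f" ] && printf '%s\n' "${f##*/}"
    done | sort > "$actual"

    # lines only in the expected list = missing files
    comm -23 "$expected" "$actual"
)
```

Calling `list_missing . 123 130` prints one missing name per line and prints nothing when the folder is complete, so an empty result can be treated as "all ok".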
$ cat tst.awk
BEGIN {
    # collect the part ids seen for each key from the file names
    for (i=1; i<ARGC; i++) {
        fname = ARGV[i]
        split(fname,fparts,/_/)
        key = fparts[3]
        id  = fparts[4]
        ids[key][id]
    }
    # visit keys and ids in ascending numeric order
    PROCINFO["sorted_in"] = "@ind_num_asc"
    for (key in ids) {
        if ( (prevKey != "") && (key != prevKey+1) ) {
            printf "key gap: %s -> %s\n", prevKey, key | "cat>&2"
        }
        prevId = ""
        idCnt = 0
        for (id in ids[key]) {
            if ( (prevId != "") && (id != prevId+1) ) {
                printf "id gap: %s, %s -> %s\n", key, prevId, id | "cat>&2"
            }
            if (id !~ /^[1234]$/) {
                printf "bad id: %s, %s\n", key, id | "cat>&2"
            }
            idCnt++
            prevId = id
        }
        if (idCnt != 4) {
            printf "bad id count: %s, %s\n", key, idCnt | "cat>&2"
        }
        prevKey = key
    }
}
$ awk -f tst.awk fact_type_*