Question

I have a log file that looks like this:

 Jan 1 06:09:23 somefile.txt
 Jan 2 12:18:27 somefile1.txt
 Jan 3 04:16:00 somefile2.txt

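I want to locate each file and insert its full path into this file. I think some combination of find, awk, and sed can do this, but so far I have not found a working solution that updates the file as shown below.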

Jan 1 06:09:23 /path/to/file/somefile.txt
Jan 2 12:18:27 /path/to/file1/somefile1.txt
Jan 3 04:16:00 /path/to/file2/somefile2.txt

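I have been able to strip out the filenames and find the files without any problem, but what I have come up with so far writes out a new file and loses the original file contents. I was hoping to keep the original file.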

#!/bin/bash
#functions
getup(){

for i in `cat /home/work/uploadtmp`
do
     find /home/uploads/*$i 2> /dev/null >> /home/work/upfile
done
}

listfile(){
while read line; do ls -lt $line; done < /home/work/upfile

}

#run functions
getup
listfile | awk '{print $1 " " $2 " " $3 " " $4}' | sort -k1M -k2 -k3 > /home/log/newfile

Answer 1

# create a temporary output file, so we only overwrite the destination when complete
tempfile=$(mktemp /home/log/newfile.XXXXXX)

# ...and tell the shell to delete that temporary file if it's still around when we exit
# ...won't work for SIGKILL or power failures, but better than nothing.
trap 'rm -f -- "$tempfile"' EXIT

# iterate over lines in the input file...
while read -r mon day time filename; do
  # ...quoting each name to only match itself...
  filename_pat=$(sed -e 's@[]*?[]@\\&@g'  <<<"$filename")
  # ...using find to locate the first file with the given name for each...
  fullname=$(find /home/uploads -name "$filename_pat" -print -quit)
  # ...and printing that new name on our stdout
  printf '%s\n' "$mon $day $time $fullname"
done </home/work/uploadtmp >"$tempfile" # ...redirecting the whole loop to our tempfile...

# ...then performing a single atomic rename to overwrite the final destination
mv "$tempfile" /home/log/newfile
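The quoting step is the least obvious part of the loop above, so here is a minimal sketch of it in isolation. The function name `escape_glob` and the sample filename are made up for illustration. It uses the same sed expression as the answer, fed through a pipe instead of a here-string so it also runs under plain sh:

```shell
#!/bin/sh
# escape_glob (hypothetical helper name): backslash-escape the glob
# metacharacters ], *, ? and [ so that find -name matches the name literally.
escape_glob() {
  printf '%s\n' "$1" | sed -e 's@[]*?[]@\\&@g'
}

# Made-up filename that contains glob characters.
escape_glob 'report[1]*.txt'    # prints: report\[1\]\*.txt
```

Without this step, a log entry such as `report[1]*.txt` would be treated by `find -name` as a pattern and could match other files.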

Answer 2

In awk, using find externally to gather the file paths:

$ cat program.awk
NR==FNR {                 # read in the files file records
    a[$NF]=$0; next }         # hash them to a and skip to the next record
{                         # find produced list processing
    n=split($0,b,"/");        # basename functionality, filename part in b[n]
    if(sub(b[n],$0,a[b[n]]))  # replace filename in a with full path version
        print a[b[n]]         # and print
}
$ awk -f program.awk files <(find .)
Jan 3 04:16:00 ./file2/somefile2.txt
Jan 1 06:09:23 ./file/somefile.txt
Jan 2 12:18:27 ./file1/somefile1.txt

This solution (like the old one) does not tolerate spaces in filenames. That is easy to fix in the first block, though, by dropping the $NF usage:

f=$0                      # current record to var f
sub(/^([^ ]+ ){3}/,"",f)  # remove timestamp
a[f]=$0                   # hash to a on f
next                      # ...
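For anyone who wants to see the space-tolerant fragment in a complete program, here is a rough sketch wrapped in a shell function. It is a variation, not the program above: the name `add_paths` and its two arguments are invented for illustration, the basename lookup uses `in` instead of `sub()` so the filename is never treated as a regex, and the `{3}` interval is spelled out because some awk builds do not support interval expressions. It assumes every log line has exactly three space-separated timestamp fields before the filename.

```shell
#!/bin/sh
# add_paths LOGFILE SEARCHDIR   (hypothetical helper name and arguments)
# Prints each log line with its filename replaced by the path found under SEARCHDIR.
add_paths() {
  find "$2" -type f | awk '
    NR == FNR {                              # first input: the log file
        f = $0
        sub(/^[^ ]+ [^ ]+ [^ ]+ /, "", f)    # remove timestamp, same effect as ([^ ]+ ){3}
        stamp[f] = substr($0, 1, length($0) - length(f))   # keep the timestamp prefix
        next
    }
    {                                        # second input: find output on stdin
        n = split($0, parts, "/")            # basename ends up in parts[n]
        if (parts[n] in stamp)
            print stamp[parts[n]] $0         # timestamp followed by the full path
    }
  ' "$1" -
}
```

Called as `add_paths files .`, it prints lines in the order find reports them, just like the program above, so pipe it through sort if the order matters. If the same basename exists in several directories, one line is printed per match.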

The old version, which @CharlesDuffy criticized in the comments (++ for that). Left here for educational purposes:

$ awk -v path=".." '{ s="find " path " -name " $NF; s | getline $NF } 1' file
Jan 1 06:09:23 ../test/file/somefile.txt
Jan 2 12:18:27 ../test/file1/somefile1.txt
Jan 3 04:16:00 ../test/file2/somefile2.txt

The find command string is built up in the variable s, and its output is written back to the last field ($NF).

bash: update a file list with the full path of each file
