Question

Suppose we have several .log files in a directory on a prod Unix machine (SunOS), for example:

ls -tlr
total 0
-rw-r--r-- 1 21922 21922 0 Sep 10 13:15 file2017-01.log
-rw-r--r-- 1 21922 21922 0 Sep 10 13:15 file2016-02.log
-rw-r--r-- 1 21922 21922 0 Sep 10 13:15 todo2015-01.log
-rw-r--r-- 1 21922 21922 0 Sep 10 13:15 fix20150223.log

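The goal is to use nawk to extract specific information from the logs (parse them) and "convert" them to .csv files, so they can later be loaded into Oracle tables. The nawk script has been tested and works like a charm, but how can I automate a bash script that does the following:

1) Get the list of the given files in this path

2) Run nawk (extract specific data/information from each log file)

3) Write each file's output to its own unique .csv file in another directory

4) Remove the .log files from this path

My concern is that the loadstamp/timestamp at the end of each file name is different. I have already implemented a script that works only for the latest date (e.g. last month), but I want to load all the historical data and I'm a bit stuck.

To visualize, my desired/target output is: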

bash-4.4$ ls -tlr
total 0
-rw-r--r-- 1 21922 21922 0 Sep 10 13:15 file2017-01.csv
-rw-r--r-- 1 21922 21922 0 Sep 10 13:15 file2016-02.csv
-rw-r--r-- 1 21922 21922 0 Sep 10 13:15 todo2015-01.csv
-rw-r--r-- 1 21922 21922 0 Sep 10 13:15 fix20150223.csv

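How can this be done in a bash script? The load is a one-off, since it is the historical data mentioned above. Any help would be much appreciated.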

Answer 1

An implementation written for readability rather than terseness might look like this:

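#!/usr/bin/env bash
# Convert every .log file in the current directory to a .csv file.
for infile in *.log; do
  [ -e "$infile" ] || continue       # no .log files: the glob stays literal, so skip it
  outfile=${infile%.log}.csv         # file2017-01.log -> file2017-01.csv
  # "yourscript" is a placeholder for your own awk program file
  # (substitute nawk for awk if that is what the script was tested with).
  if awk -f yourscript <"$infile" >"$outfile"; then
    rm -f -- "$infile"               # success: the source log is no longer needed
  else
    echo "Processing of $infile failed" >&2
    rm -f -- "$outfile"              # failure: discard the partial output, keep the log
  fi
done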
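The loop above writes each .csv next to its .log, while the question asks for the output to go to another directory. A minimal sketch of that variant follows. The function name convert_logs, the directory arguments, and the AWK and AWK_PROG variables are hypothetical names chosen for illustration, and the one-line awk program only stands in for the asker's real nawk script.

```shell
#!/usr/bin/env bash
# Sketch only: convert_logs, AWK and AWK_PROG are illustrative names,
# not part of the original answer.

convert_logs() {
  local src_dir=$1 dest_dir=$2 infile name outfile
  mkdir -p "$dest_dir" || return 1
  for infile in "$src_dir"/*.log; do
    [ -e "$infile" ] || continue            # glob matched nothing
    name=${infile##*/}                      # strip the directory part
    outfile=$dest_dir/${name%.log}.csv      # same base name, .csv extension
    # Set AWK=nawk on systems where the script was tested with nawk.
    if "${AWK:-awk}" "$AWK_PROG" <"$infile" >"$outfile"; then
      rm -f -- "$infile"
    else
      echo "Processing of $infile failed" >&2
      rm -f -- "$outfile"
    fi
  done
}

# Demo in a throwaway directory with a stand-in awk program.
AWK_PROG='{ print $1 "," $2 }'
work=$(mktemp -d)
mkdir "$work/logs"
printf 'a b\n' >"$work/logs/file2017-01.log"
printf 'c d\n' >"$work/logs/fix20150223.log"
convert_logs "$work/logs" "$work/csv"
ls "$work/csv"
```

Because the output name is derived from each input name, differing date stamps in the file names need no special handling.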

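To understand how this works, see:

Globbing - the mechanism by which *.log is replaced with a list of the files that have that extension.
The Classic for Loop - the for infile in syntax, used to iterate over the results of the glob above.
Parameter expansion - the ${infile%.log} syntax, used to expand the contents of the infile variable with any .log suffix trimmed.
Redirection - the syntax used in <"$infile" and >"$outfile", which opens stdin and stdout attached to the named files; or >&2, which redirects the log message to stderr. (Thus, when we run awk, its stdin is connected to the .log file and its stdout is connected to the .csv file.)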
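Each of the four mechanisms listed above can be tried in isolation. The following is a small sketch run in a temporary directory; the file names and the use of tr as a stand-in filter are made up for the demonstration.

```shell
#!/usr/bin/env bash
# Illustrative only: file names and the tr filter are assumptions.

# Parameter expansion: trim a trailing .log, then append .csv.
infile=file2017-01.log
outfile=${infile%.log}.csv
echo "$outfile"                 # prints file2017-01.csv

other=notes.txt
echo "${other%.log}"            # prints notes.txt (no .log suffix to trim)

# Globbing and the for loop: only names ending in .log are visited.
tmp=$(mktemp -d)
cd "$tmp" || exit 1
touch a.log b.log c.txt
for f in *.log; do
  echo "found $f"               # prints found a.log, then found b.log
done

# Redirection: the filter reads in.log on stdin and writes out.csv on stdout.
printf 'hello\n' >in.log
tr '[:lower:]' '[:upper:]' <in.log >out.csv
cat out.csv                     # prints HELLO

# >&2 sends a message to stderr instead of stdout.
echo "this goes to stderr" >&2
```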

How do I iterate over .log files, process them with awk, and write the output to files with a different extension?
