Question

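I'm trying to figure out how to parse a set of files containing raw log data (the output of crontab -l) and convert that data into a CSV file. The entries in the files look like this: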

10,25,40,55 * * * * /some/cron/here > /dev/null 2>&1
30 */4 * * * /some/cron/here

and so on.

I'd like them to end up in a CSV file in this format:

Cronjob | # of Servers | Every minute | Every hour | Every day | Every week | Every month
-----------------------------------------------------------------------------------------
CronHere| 10 | N | N | Y | Y | Y
CronHere| 8 | Y | N | N | Y | Y

and so on.

Can anyone give me an example of how to do this?

Answer 1

You can parse these files with Perl regular expressions, then use Text::CSV to arrange the data and save the output.
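
The regular-expression approach can be sketched roughly as follows. This sketch is written in Ruby rather than Perl, to match the script in Answer 3. CRON_LINE and parse_cron_line are hypothetical names, and it assumes plain five-field entries: special forms such as @reboot are simply rejected, and environment assignments are not recognized specially.

```ruby
# Hypothetical pattern: five whitespace-separated schedule fields, then the command.
CRON_LINE = /\A\s*(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(.+?)\s*\z/

# Returns [schedule_fields, command], or nil for blank lines, comments,
# and anything that does not look like a five-field entry.
def parse_cron_line(line)
  return nil if line =~ /\A\s*(#|\z)/
  m = CRON_LINE.match(line)
  return nil unless m
  [m.captures[0, 5], m[6]]
end

p parse_cron_line("30 */4 * * * /some/cron/here")
```

The command is captured as one field, so redirections such as > /dev/null 2>&1 stay attached to it.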

Answer 2

Something like this should get you started:

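#!/usr/bin/env perl
use strict;
use warnings;
while (<>) {
    chomp;
    next if /^\s*(?:#|$)/;    # skip comments and blank lines
    # Five schedule fields, then the command (which may contain spaces).
    my @line = split q( ), $_, 6;
    next unless @line == 6;
    # Command first, then the schedule fields, pipe-delimited.
    print join(q(|), $line[5], @line[0..4]), "\n";
}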

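As for counting the number of servers a job runs on, you first need to define more precisely how to tell jobs apart: by name only, or by an exact match on all arguments. Once you have decided that, you can count them with a hash.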
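
The hash-counting idea can be sketched like this, in Ruby rather than Perl. It assumes that two jobs are "the same" when the full command string after the five schedule fields matches exactly. The sample lines and the name count_commands are made up for illustration.

```ruby
# Count how often each command appears across a list of crontab lines.
# Assumption: a job is identified by its full command string; the
# schedule fields are ignored when comparing.
def count_commands(lines)
  counts = Hash.new(0)                      # missing keys start at 0
  lines.each do |line|
    next if line.lstrip.start_with?('#')    # skip comments
    fields = line.split(' ', 6)             # five schedule fields + the rest
    next if fields.length < 6               # skip blank or malformed lines
    counts[fields[5].strip] += 1
  end
  counts
end

sample = [
  "30 */4 * * * /some/cron/here",
  "10,25,40,55 * * * * /some/cron/here",
  "0 1 * * * /another/cron/there",
]
count_commands(sample).each { |cmd, n| puts "#{cmd}|#{n}" }
```

To match by name only, one option would be to key the hash on fields[5].split.first instead of the whole command string.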

Answer 3

I eventually got this done, thanks to Ruby.

#!/usr/bin/ruby

crons = []   # distinct command strings
counts = []  # counts[i] = number of times crons[i] was seen
cparts = []  # cparts[i] = the five schedule fields of crons[i]
basedir = "/logdirectory"

# "N" if the schedule field is the wildcard "*", "Y" otherwise
def YorN(part)
  part == "*" ? "N" : "Y"
end

Dir.new(basedir).entries.each do |logfile|
  path = File.join(basedir, logfile)  # entries are relative to basedir
  next if File.directory?(path)
  next unless File.extname(logfile) == '.log'
  File.foreach(path) do |line|
    parts = line.split(' ')
    if parts.length > 5
      cmd = parts[5..-1].join(' ')
      idx = crons.index(cmd)
      if idx
        counts[idx] += 1
      else
        crons << cmd
        idx = crons.length - 1
        counts[idx] = 1
        cparts[idx] = parts[0, 5] # an Array containing another Array !
      end
    else
      puts "Error on: #{line.chomp} in file #{logfile}"
    end
  end
end
# OUTPUT results
puts "# Servers  Min  Hour  DOM  Month DOW  Cronjob"
crons.each_with_index do |cmd, idx|
  flags = cparts[idx].map { |part| YorN(part) }
  puts "#{counts[idx]} #{flags.join(' ')} #{cmd}"
end
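
The script above prints space-separated columns rather than real CSV. Below is a minimal sketch of writing CSV with Ruby's csv library, assuming the per-command counts and schedule fields have already been collected into a hash. The jobs data and the names y_or_n and jobs_to_csv are invented for illustration, and on recent Ruby versions csv may need to be installed as a gem.

```ruby
require 'csv'

# Same convention as YorN above: "N" for the wildcard "*", "Y" otherwise.
def y_or_n(part)
  part == "*" ? "N" : "Y"
end

# jobs maps a command string to { count: Integer, schedule: [5 fields] }.
def jobs_to_csv(jobs)
  CSV.generate do |csv|
    csv << ["Cronjob", "# of Servers", "Min", "Hour", "DOM", "Month", "DOW"]
    jobs.each do |cmd, info|
      csv << [cmd, info[:count], *info[:schedule].map { |f| y_or_n(f) }]
    end
  end
end

jobs = {
  "/some/cron/here" => { count: 2, schedule: ["30", "*/4", "*", "*", "*"] },
}
puts jobs_to_csv(jobs)
```

Letting the library build each row means commands that contain commas or quotes are quoted correctly, which hand-built strings would not guarantee.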

Parsing text files and generating CSV

3 answers: