AWK - change dates to seconds

Time: 2012-11-03 10:41:22

Tags: awk gsub

I have a file:

pablo tty8 Thu Nov 1 12:51:21 2012 still logged in 
(unknown tty8 Thu Nov 1 12:50:57 2012 - Thu Nov 1 12:51:21 2012 (00:00) 
pablo tty2 Thu Nov 1 12:50:39 2012 still logged in 
pablo tty7 Thu Nov 1 12:49:45 2012 - Thu Nov 1 12:50:56 2012 (00:01) 
(unknown tty7 Thu Nov 1 12:34:32 2012 - Thu Nov 1 12:49:45 2012 (00:15)

I want to replace the dates in the file above with seconds since the epoch. I want to print:

pablo tty8 1351770681 still logged in 
(unknown tty8 1351770657 - 1351770681 (00:00) 
pablo tty2 1351770639 still logged in 
pablo tty7 1351770585 - 1351770656 (00:01) 
(unknown tty7 1351769672 - 1351770585 (00:15)

I tried this command:

gawk --posix 'function my()
{"date -d \047"$0"\047 +%s" | getline b; 
gsub( /[A-Za-z]{3} [A-Za-z]{3} [0-9] ([0-9]{2}:){2}[0-9]{2} [0-9]{4}/,b ); print}
{ my() }' file

The command above does not work:

$ gawk --posix 'function my()
> {"date -d \047"$0"\047 +%s" | getline b; 
> gsub( /[A-Za-z]{3} [A-Za-z]{3} [0-9] ([0-9]{2}:){2}[0-9]{2} [0-9]{4}/,b ); print}
> { my() }' ta
date: błędna data: `pablo tty8 Thu Nov 1 12:51:21 2012 still logged in '
pablo tty8  still logged in 
(unknown tty8 1351897200 - 1351897200 (00:00) 
date: błędna data: `pablo tty2 Thu Nov 1 12:50:39 2012 still logged in '
pablo tty2 1351897200 still logged in 
date: błędna data: `pablo tty7 Thu Nov 1 12:49:45 2012 - Thu Nov 1 12:50:56 2012 (00:01) '
pablo tty7 1351897200 - 1351897200 (00:01) 
(unknown tty7 1351897200 - 1351897200 (00:15)

How can I fix the command above?

Thank you for your help.

3 answers:

Answer 0 (score: 5)

If you have vim installed, try this command:

:%s/\v\w+\s\w+\s\d+\s\d+:\d+:\d+\s\d+/\=system('date +%s -d"'.submatch(0).'" | tr -d "\n"')/g

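The idea is very simple, and vim can do it quickly. The \= at the start of the replacement makes it a Vim expression, so each matched date (submatch(0)) is handed to date through system(), and tr strips the trailing newline from the result.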

Answer 1 (score: 2)

Here is one way using GNU awk. Run it like this:

awk -f script.awk file.txt

Contents of script.awk:

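# Fields: $1 user, $2 tty, $3-$7 first date, then either
# "still logged in" or "- <second date in $9-$13> (duration)".
{
    line = ($0 ~ /still logged in/) ? "still logged in" : "-" OFS getstamp(10) OFS $NF
    print $1, $2, getstamp(4), line
} 

# Convert the date whose month name is in field i to epoch seconds.
function getstamp(i) {

    split($(i + 2), T, ":")    # HH:MM:SS

    Y = $(i + 3)
    M = convert($i)
    D = $(i + 1)

    # The + 9 shifts the hour. It presumably compensates for the time zone
    # of the machine that produced the result below; drop it to interpret
    # the dates in your own local time.
    hrs = T[1] + 9
    min = T[2]
    sec = T[3]

    # mktime() is a gawk extension and expects "YYYY MM DD HH MM SS".
    return(mktime(sprintf("%d %d %d %d %d %d", Y, M, D, hrs, min, sec)))
}

# Map a three-letter English month abbreviation to 1..12.
function convert(month) {

    return(((index("JanFebMarAprMayJunJulAugSepOctNovDec", month) - 1) / 3) + 1)
}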

Result:

pablo tty8 1351770681 still logged in
(unknown tty8 1351770657 - 1351770681 (00:00)
pablo tty2 1351770639 still logged in
pablo tty7 1351770585 - 1351770656 (00:01)
(unknown tty7 1351769672 - 1351770585 (00:15)
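
Because mktime() reads its argument as local time, the `+ 9` in getstamp() ties the script to one machine's time zone. As a hedged alternative, the sketch below drops the offset and pins the zone through the TZ environment variable instead. The value `CET-1` (UTC+1) is an assumption, chosen because the expected numbers in the question match that offset (1351770681 is 2012-11-01 11:51:21 UTC). The function names `stamps` and `month` are illustrative, and `awk` is assumed to be GNU awk.

```shell
# Sketch: same field layout as script.awk, but without the hard-coded hour shift.
# Assumes awk is GNU awk (mktime() is a gawk extension).
stamps() {
  awk '
  function getstamp(i,    t) {
    split($(i + 2), t, ":")                      # HH:MM:SS
    # mktime() wants "YYYY MM DD HH MM SS" and reads it as local time.
    return mktime(sprintf("%d %d %d %d %d %d", $(i + 3), month($i), $(i + 1), t[1], t[2], t[3]))
  }
  function month(name) {
    # Position of the abbreviation in the string, mapped to 1..12.
    return (index("JanFebMarAprMayJunJulAugSepOctNovDec", name) + 2) / 3
  }
  {
    rest = ($0 ~ /still logged in/) ? "still logged in" : ("- " getstamp(10) " " $NF)
    print $1, $2, getstamp(4), rest
  }'
}

# TZ=CET-1 is an assumption (UTC+1, no DST rule); substitute your own zone.
( export TZ=CET-1
  printf '%s\n' 'pablo tty8 Thu Nov 1 12:51:21 2012 still logged in' | stamps )
```

With that TZ setting the sample line should come out as `pablo tty8 1351770681 still logged in`, the value the question asks for.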

Answer 2 (score: 1)

Here is a solution that calls date from awk (gawk only):

awk --posix '
{
  while(match($0,/([[:alpha:]]{3} ){2}[^[:alpha:]]+[0-9]{4}/)){
    date_str=substr($0, RSTART, RLENGTH)
    "date -d \""date_str"\" +%s" | getline date_sec
    sub(date_str,date_sec,$0)
  }
  print
}
' $1

Output:

pablo tty8 1351745481 still logged in 
(unknown tty8 1351745457 - 1351745457 (00:00) 
pablo tty2 1351745439 still logged in 
pablo tty7 1351745385 - 1351745456 (00:01) 
(unknown tty7 1351744472 - 1351744472 (00:15)

Notes:

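  1. The match and substr combination extracts the substring that contains the date.
  2. date converts the date substring to seconds (+%s), and the result is assigned to date_sec.
  3. The date in string format is replaced with the date in seconds.
  4. The loop repeats until no match is found (match returns 0 when nothing matches, which ends the while loop).
  5. gawk's --re-interval or --posix option enables interval expressions such as {3}.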
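
In the output above, the second timestamp on the two `(unknown` lines repeats the first one. A likely cause is that the date command is never closed: a date string already converted on an earlier line maps to the same command string, getline then hits end of input, and date_sec keeps its previous value. The sketch below is one possible variant of the loop that calls close() after each read. The function name `dates_to_epoch`, the `?` placeholder, and `TZ=CET-1` are assumptions for illustration, and GNU date is assumed for `-d`.

```shell
# Convert every "Thu Nov 1 12:51:21 2012"-style date on stdin to epoch seconds.
# Assumes GNU date (for -d).
dates_to_epoch() {
  awk '
  {
    # Day name, month name, day, HH:MM:SS, year. Written without interval
    # expressions, so no --posix or --re-interval switch is needed.
    while (match($0, /[A-Z][a-z][a-z] [A-Z][a-z][a-z] +[0-9]+ [0-9][0-9]:[0-9][0-9]:[0-9][0-9] [0-9][0-9][0-9][0-9]/)) {
      date_str = substr($0, RSTART, RLENGTH)
      cmd = "date -d \"" date_str "\" +%s"
      if ((cmd | getline date_sec) <= 0)
        date_sec = "?"          # date failed: do not reuse a stale value
      close(cmd)                # lets the same date string be converted again later
      # Splice by position instead of sub(), so date_str is never used as a regex.
      $0 = substr($0, 1, RSTART - 1) date_sec substr($0, RSTART + RLENGTH)
    }
    print
  }'
}

# TZ=CET-1 is an assumption (UTC+1); date inherits it from the environment.
( export TZ=CET-1
  printf '%s\n' 'pablo tty7 Thu Nov 1 12:49:45 2012 - Thu Nov 1 12:50:56 2012 (00:01)' | dates_to_epoch )
```

Each date costs one external date process, so for large files an approach based on mktime() is likely to be faster.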