Question

I have a date in an input file (say fileA) in the format YYYY-MM-DD

e.g. 2016-10-18

I also have a list of dates in timestamp format (YYYYMMDDhhmmss) in another file (say fileB), as shown below

20161017120311
20161017140317
20161018010315
20161018160311
20161019020310
20161019124015

Now I want to select only the maximum value from fileB whose date equals the date in fileA. So in this case, the value selected from fileB would be 20161018160311.

It may also happen that fileB has no record for the date 20161018. Suppose fileB looks like this

20161017120311
20161017140317
20161019020310
20161019124015
20161020010315
20161021160311

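In that case the same code should select the maximum value of the next available date. Here the next available date is 20161019, and the maximum value for 20161019 is 20161019124015, so the output should be 20161019124015.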

Answer 1

Try the following awk command, which is more efficient than Marcel Jacques Machado's answer:

#!/bin/sh

fileA='/path/to/file A'
fileB='/path/to/file B'

# Assumes fileB is sorted in ascending order.
awk -v refDate="$(tr -d '-' < "$fileA")" '
  { d = substr($0, 1, length(refDate)) }  # date part of the current row
  d < refDate { next }                    # skip rows before the reference date
  target == "" { target = d }             # first date >= refDate: exact match or next available
  d != target { exit }                    # we are done once a later date is reached
  { lastMatch = $0 }                      # latest timestamp seen so far on the target date
  END { if (lastMatch != "") print lastMatch }  # print nothing if no date qualifies
' "$fileB"

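This creates 2 child processes, one for the command substitution involving tr and one for awk, compared with an estimated 15 at most for Marcel's solution, which may also read the input files multiple times.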
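
Exiting as soon as a later row is reached only works if fileB is already in ascending order, as it is in the question. A minimal sketch of a guard for that assumption, using `sort -c`; the helper name `is_sorted` is hypothetical and not part of the answer:

```shell
#!/bin/sh
# is_sorted is a hypothetical helper name, not part of the original answer.
# sort -c only checks the order: it exits 0 if the file is already sorted
# and non-zero otherwise, without writing any sorted output.
is_sorted() {
  LC_ALL=C sort -c "$1" 2>/dev/null
}

# Sketch of use (paths are placeholders): sort a copy when the guard fails.
# fileB='/path/to/file B'
# is_sorted "$fileB" || sort "$fileB" > "$fileB.sorted"
```

The check costs one extra pass over fileB, so it is only worth adding when the input order is not guaranteed.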

Answer 2

$ cat program.awk
NR==FNR {                    # get the date from the first file
    a=$0"000000"             # zeropad the end (a="2016-10-18000000")
    gsub(/-/,"",a)           # remove dashes (a="20161018000000")
    next                 
} 
$0 >= a {                    # we sorted fileB so the next bigger or equal to a is the date
    if(b=="")                # pick whatever is the next date for reference
        b=substr($0,1,8)     # just the date part
    d=substr($0,1,8)         # get this records date part
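    if (d>b)                 # if this date is bigger than the reference...
        exit                 # stop reading; END prints the result
    c=$0                     # the latest timestamp on this date
}
END {                        # also runs after exit
    if (c!="") print c       # print even when the target date is the last one in fileB
}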

Run it:

$ awk -f program.awk fileA <(sort -n fileB)
20161019124015
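
The `<(...)` process substitution is a bash/ksh/zsh feature and is not available in plain POSIX sh. A sketch of an equivalent invocation that pipes the sorted list into awk instead, relying on awk treating the operand `-` as standard input; the wrapper name `latest_on_or_after` and its optional third argument are hypothetical:

```shell
#!/bin/sh
# latest_on_or_after is a hypothetical wrapper name.
# $1 = fileA (the YYYY-MM-DD date), $2 = fileB (the timestamps),
# $3 = path to program.awk from this answer (defaults to ./program.awk).
latest_on_or_after() {
  # awk reads fileA first, then the sorted timestamps from stdin ("-").
  sort -n "$2" | awk -f "${3:-program.awk}" "$1" -
}

# Usage sketch:
# latest_on_or_after fileA fileB
```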

Answer 3

maxDate.sh:

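#!/bin/bash
fileA=$1
fileB=$2
date=$(sed 's/-//g' "$fileA")                   # 2016-10-18 -> 20161018
max=$(grep "^$date" "$fileB" | sort | tail -1)  # latest timestamp on that date
if [[ -z $max ]]
then
    # No entry for that date: sort the date in among the timestamps, take the
    # line right after it, and keep its first 8 digits as the new date.
    date=$(sed 's/-//g' "$fileA" "$fileB" | sort | grep -A1 "^$date\$" | tail -1 | grep -Eo '^[0-9]{8}')
    max=$(grep "^$date" "$fileB" | sort | tail -1)
fi
echo "$max"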

Run:

./maxDate.sh fileA fileB
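
The fallback branch is the least obvious part of the script: it sorts the bare date in among the timestamps and uses grep's line-context output to pick the entry that follows it. A small sketch of that step in isolation, assuming a grep that supports `-A` (a GNU/BSD extension); the helper name `next_after` is hypothetical:

```shell
#!/bin/sh
# next_after is a hypothetical helper name used only for this illustration.
# $1 = date as YYYYMMDD, remaining arguments = timestamps (YYYYMMDDhhmmss).
next_after() {
  ref=$1
  shift
  # Sort the bare date in among the timestamps, keep the matched date line
  # plus the one line that follows it (-A1), and take the last of those.
  printf '%s\n' "$ref" "$@" | sort | grep -A1 "^$ref\$" | tail -1
}

next_after 20161018 20161017140317 20161019020310 20161019124015
# prints 20161019020310
```

The first 8 digits of that line (20161019) become the new date, and the script then looks up the largest timestamp for it. If no later timestamp exists, the helper returns the date itself, so the final lookup finds nothing.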

Find the maximum value from a list that is greater than or equal to another value
