Question

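From the file below, I want to select only the rows that have the latest LAST_UPDATE time for a given stock.

For example, there are 3 rows for the stock TCS, and I want to print only the one with the highest LAST_UPDATE value.

Any help is greatly appreciated.

Input file: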

LAST_UPDATE,Stock,YOUR_PRICE,MY_PRICE 

04:19:44.314,INFY,146.766,146.7669

05:00:07.405,TCS,2452.21,2453.8296

06:05:25.306,TATA,0,1320.0611

06:05:27.184,TATA,0,1320.0611

07:00:04.426,TCS,2463.8,2463.8037

07:00:08.022,TCS,2463.8,2463.8037

Expected output:

LAST_UPDATE,Stock,YOUR_PRICE,MY_PRICE

04:19:44.314,INFY,146.766,146.7669

06:05:27.184,TATA,0,1320.0611

07:00:08.022,TCS,2463.8,2463.8037

Answer 1

Here you go:

Script:

#!/bin/ksh
tempfile="stocktemp"
mkdir -p "$tempfile"
# split the input into one temp file per stock, named after the stock
while IFS= read -r line; do
        # skip blank lines
        [ -z "$line" ] && continue
        stock=`echo "$line" | cut -d "," -f 2`
        echo "$line" >> "$tempfile/$stock.txt"
done < inputfile
# Remove the file generated from the header line of inputfile
rm -f "$tempfile/Stock.txt"
# in each stock file ...
for file in "$tempfile"/*; do
        # (init the comparator; reset it for every stock)
        time=""
        final=""
        # ... we compare the timestamps of the lines
        while IFS= read -r line; do
                # take the full timestamp, milliseconds included
                comp=`echo "$line" | cut -d "," -f 1`
                # zero-padded HH:MM:SS.mmm sorts chronologically as a string,
                # so a string comparison is enough (no conversion to seconds)
                if [[ "$comp" > "$time" ]]; then
                        time=$comp
                        final=$line
                fi
        done < "$file"
        echo "$final"
done
rm -rf "$tempfile"

Input file:

LAST_UPDATE,Stock,YOUR_PRICE,MY_PRICE 

04:19:44.314,INFY,146.766,146.7669

05:00:07.405,TCS,2452.21,2453.8296

06:05:25.306,TATA,0,1320.0611

06:05:27.184,TATA,0,1320.0611

07:00:04.426,TCS,2463.8,2463.8037

07:00:08.022,TCS,2463.8,2463.8037

Test:

Will /home/will # ./script.ksh
04:19:44.314,INFY,146.766,146.7669
06:05:27.184,TATA,0,1320.0611
07:00:08.022,TCS,2463.8,2463.8037

Not the cleanest, but it works. If you want the result written to a file, change echo $final to echo $final >> output.txt.
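
For a numeric comparison of two LAST_UPDATE values, one option is to convert each timestamp to milliseconds since midnight, which also keeps two updates within the same second apart. This is only a minimal sketch: to_millis is a hypothetical helper name, and it assumes every timestamp has the form HH:MM:SS.mmm.

```shell
#!/bin/sh
# to_millis is a hypothetical helper used only for illustration.
# It converts HH:MM:SS.mmm into milliseconds since midnight:
# ((hours * 60 + minutes) * 60 + seconds) * 1000 + milliseconds
to_millis() {
        echo "$1" | awk -F'[:.]' '{ print (($1 * 60 + $2) * 60 + $3) * 1000 + $4 }'
}

a=$(to_millis "07:00:04.426")
b=$(to_millis "07:00:08.022")
echo "$a $b"

# Integer comparison now orders the two TCS updates correctly
if [ "$b" -gt "$a" ]; then
        echo "07:00:08.022 is the later update"
fi
```

Note that hours are weighted by 3600 and minutes by 60 here, so the resulting number grows strictly with the time of day.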

Answer 2

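Assumptions:

an awk solution is acceptable
the input file may not (already) be sorted by timestamp
the output is sorted alphabetically by stock/symbol name (special case: the 'Stock' line is always printed first)
blank lines are skipped/ignored in the output (otherwise the solution could be edited to add blank lines between output lines)

One possible awk solution: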

$ cat find_last.awk
$2=="Stock" { print ; next }            # print "Stock" line when we find it; skip "NF==4" processing by going to next line in file

NF==4       { lastline[$2]=$0 }         # if field count (NF) = 4 then store latest line for $2=symbol in associative array;
                                        # has added benefit that it ignores blank lines

END { n = asorti(lastline, x)           # sort our array indices (aka symbol names); asorti() is a GNU awk extension; 'n' = count of indices; x[] array of indices

      for ( i=1 ; i<=n; i++ ) {         # loop through our list of n array indices (aka symbol names)

          print lastline[x[i]]          # print the (last/greatest) line for a stock/symbol
      }
    }

END { ... }: executed (once) after the input file has been processed
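
Note that the NF==4 rule only remembers the last line read for each symbol, so the result is correct only when a later line also carries a later timestamp. The sketch below illustrates that dependency; keep_last is a hypothetical helper that mirrors the rule for TCS alone, and the two rows are taken from the sample data but placed in reverse order on purpose.

```shell
#!/bin/sh
# keep_last is a hypothetical helper: it stores the last line seen per symbol
# and prints the one kept for TCS.
keep_last() {
        awk -F, 'NF == 4 { last[$2] = $0 } END { print last["TCS"] }'
}

# Two TCS rows, deliberately out of chronological order
unsorted='07:00:08.022,TCS,2463.8,2463.8037
05:00:07.405,TCS,2452.21,2453.8296'

# Without sorting, the older row wins simply because it was read last
without_sort=$(printf '%s\n' "$unsorted" | keep_last)
# With sorting (byte order via LC_ALL=C), the newest row is read last
with_sort=$(printf '%s\n' "$unsorted" | LC_ALL=C sort | keep_last)

echo "without sort: $without_sort"
echo "with sort:    $with_sort"
```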

Our sample input file (including the blank lines from the original question):

$ cat infile
LAST_UPDATE,Stock,YOUR_PRICE,MY_PRICE

04:19:44.314,INFY,146.766,146.7669

05:00:07.405,TCS,2452.21,2453.8296

06:05:25.306,TATA,0,1320.0611

06:05:27.184,TATA,0,1320.0611

07:00:04.426,TCS,2463.8,2463.8037

07:00:08.022,TCS,2463.8,2463.8037

The awk script in action:

$ sort infile | awk -F, -f find_last.awk
LAST_UPDATE,Stock,YOUR_PRICE,MY_PRICE
04:19:44.314,INFY,146.766,146.7669
06:05:27.184,TATA,0,1320.0611
07:00:08.022,TCS,2463.8,2463.8037

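sort infile | awk ...: sort the input file by timestamp and pipe the output to the awk command
-F,: set the input field separator to a comma (,)
-f find_last.awk: read the awk program from the file named find_last.awk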
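
If sorting the input first is not an option, the timestamp comparison can be done inside awk instead. The following is a sketch rather than a drop-in replacement: latest_per_stock is a hypothetical function name, the sample rows are written to a throwaway file, and it assumes zero-padded HH:MM:SS.mmm timestamps so that string order equals chronological order. It avoids asorti(), so stocks are printed in the order they first appear rather than alphabetically.

```shell
#!/bin/sh
# Sketch only: for each stock keep the line with the greatest LAST_UPDATE,
# without sorting the input first. latest_per_stock is a hypothetical name.
latest_per_stock() {
        awk -F, '
                $2 == "Stock" { print; next }   # header line: print it as soon as it is seen
                NF == 4 {                       # data line (blank lines have NF == 0)
                        if (!($2 in ts)) order[++n] = $2        # remember first-seen order
                        if ($1 > ts[$2]) { ts[$2] = $1; line[$2] = $0 }
                }
                END { for (i = 1; i <= n; i++) print line[order[i]] }
        ' "$1"
}

# Deliberately unsorted sample data in a throwaway file
sample=$(mktemp)
cat > "$sample" <<'EOF'
LAST_UPDATE,Stock,YOUR_PRICE,MY_PRICE
07:00:08.022,TCS,2463.8,2463.8037
04:19:44.314,INFY,146.766,146.7669
05:00:07.405,TCS,2452.21,2453.8296
06:05:27.184,TATA,0,1320.0611
06:05:25.306,TATA,0,1320.0611
EOF

result=$(latest_per_stock "$sample")
echo "$result"
rm -f "$sample"
```

Piping the data rows through sort -t, -k2,2 afterwards would restore alphabetical order by symbol if that matters.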

Group by in shell to get the maximum date in a file
