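From the file below, I want to select only the rows that have the latest LAST_UPDATE time for a given stock.
For example, there are 3 rows for stock TCS, so I want to print only the one with the highest LAST_UPDATE value.
Any help is much appreciated.
Input file: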
LAST_UPDATE,Stock,YOUR_PRICE,MY_PRICE
04:19:44.314,INFY,146.766,146.7669
05:00:07.405,TCS,2452.21,2453.8296
06:05:25.306,TATA,0,1320.0611
06:05:27.184,TATA,0,1320.0611
07:00:04.426,TCS,2463.8,2463.8037
07:00:08.022,TCS,2463.8,2463.8037
Expected output:
LAST_UPDATE,Stock,YOUR_PRICE,MY_PRICE
04:19:44.314,INFY,146.766,146.7669
06:05:27.184,TATA,0,1320.0611
07:00:08.022,TCS,2463.8,2463.8037
Answer 0 (score: 0)
Here you go:
Script:
#!/bin/ksh
tempfile="stocktemp"
mkdir $tempfile
# split the lines by stock: each stock gets a temp file named after it
while read line; do
stock=`echo $line | cut -d "," -f 2`
echo $line >> "$tempfile/$stock.txt"
done < inputfile
# Remove the line generated because of the top line in inputfile
rm $tempfile/Stock.txt
# for every stock file ...
for file in $tempfile/*; do
# (initialize the comparator)
time="00:00:00"
# ... we compare the time between the lines
while read line; do
# we select the time in the line where we removed the .xyz at the end (we don't need ms)
comp=`echo $line | cut -d "," -f 1 | cut -d "." -f 1`
# we compare the times converted to seconds (hours*3600 + minutes*60 + seconds)
if [ `echo $comp | awk -F: '{ print $1*3600 + $2*60 + $3 }'` -gt `echo $time | awk -F: '{ print $1*3600 + $2*60 + $3 }'` ]; then
time=$comp
final=$line
fi
done < $file
echo $final
done
rm -rf $tempfile
Input file:
LAST_UPDATE,Stock,YOUR_PRICE,MY_PRICE
04:19:44.314,INFY,146.766,146.7669
05:00:07.405,TCS,2452.21,2453.8296
06:05:25.306,TATA,0,1320.0611
06:05:27.184,TATA,0,1320.0611
07:00:04.426,TCS,2463.8,2463.8037
07:00:08.022,TCS,2463.8,2463.8037
Test:
Will /home/will # ./script.ksh
04:19:44.314,INFY,146.766,146.7669
06:05:27.184,TATA,0,1320.0611
07:00:08.022,TCS,2463.8,2463.8037
Not the cleanest, but it works. If you want the result written to a file, replace echo $final with:
echo $final >> output.txt
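The script above decides which line is newer by converting each timestamp to seconds. A minimal sketch of that conversion in isolation may make the comparison easier to check. The helper name to_seconds is hypothetical and not part of the answer, and the sketch uses awk for the arithmetic:

```shell
#!/bin/sh
# Hypothetical helper (not from the answer): convert HH:MM:SS, optionally
# followed by .mmm, to whole seconds. The milliseconds are ignored.
to_seconds() {
  echo "$1" | awk -F'[:.]' '{ print $1 * 3600 + $2 * 60 + $3 }'
}

a=$(to_seconds "07:00:04.426")
b=$(to_seconds "07:00:08.022")

# Keep whichever timestamp is later, as the loop in the script does.
if [ "$b" -gt "$a" ]; then
  newer="07:00:08.022"
else
  newer="07:00:04.426"
fi

echo "$a $b $newer"
```

Because only whole seconds are compared, two updates within the same second count as a tie, and a strict "greater than" test keeps the first one read.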
Answer 1 (score: 0)
One possible awk solution (it uses asorti, which is a GNU awk extension):
$ cat find_last.awk
$2=="Stock" { print ; next } # print "Stock" line when we find it; skip "NF==4" processing by going to next line in file
NF==4 { lastline[$2]=$0 } # if field count (NF) = 4 then store latest line for $2=symbol in associative array;
# has added benefit that it ignores blank lines
END { n = asorti(lastline, x) # sort our array indices (aka symbol names); 'n' = count of indices; x[] array of indices
for ( i=1 ; i<=n; i++ ) { # loop through our list of n array indices (aka symbol names)
print lastline[x[i]] # print the (last/greatest) line for a stock/symbol
}
}
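END { ... }: executed (once) after the whole input file has been processed.
Our sample input file (including the blank lines from the original question):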
$ cat infile
LAST_UPDATE,Stock,YOUR_PRICE,MY_PRICE
04:19:44.314,INFY,146.766,146.7669
05:00:07.405,TCS,2452.21,2453.8296
06:05:25.306,TATA,0,1320.0611
06:05:27.184,TATA,0,1320.0611
07:00:04.426,TCS,2463.8,2463.8037
07:00:08.022,TCS,2463.8,2463.8037
The awk script in action:
$ sort infile | awk -F, -f find_last.awk
LAST_UPDATE,Stock,YOUR_PRICE,MY_PRICE
04:19:44.314,INFY,146.766,146.7669
06:05:27.184,TATA,0,1320.0611
07:00:08.022,TCS,2463.8,2463.8037
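sort infile | awk ...: sorts the input file (the timestamp is the first field, so the lines end up in time order) and pipes the output to awk
-F,: sets the input field separator to a comma (,)
-f find_last.awk: runs the awk commands stored in the file named find_last.awk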
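For awk implementations without asorti, a variant along the following lines might work. This is a sketch of a different technique, not the answer's script: instead of pre-sorting and keeping the last line seen, it tracks the greatest timestamp per stock by string comparison. It assumes the timestamps are always fixed-width HH:MM:SS.mmm, so that string order matches time order. The variable names (workdir, latest_rows, best, line) are made up for illustration.

```shell
#!/bin/sh
# Sketch: keep the row with the greatest LAST_UPDATE per stock, no pre-sort.
# Assumption: timestamps are fixed-width HH:MM:SS.mmm (string order == time order).
workdir=$(mktemp -d)
cat > "$workdir/infile" <<'EOF'
LAST_UPDATE,Stock,YOUR_PRICE,MY_PRICE
04:19:44.314,INFY,146.766,146.7669
05:00:07.405,TCS,2452.21,2453.8296
06:05:25.306,TATA,0,1320.0611
06:05:27.184,TATA,0,1320.0611
07:00:04.426,TCS,2463.8,2463.8037
07:00:08.022,TCS,2463.8,2463.8037
EOF

latest_rows=$(
  # Header first, unchanged.
  head -n 1 "$workdir/infile"
  # Data rows: remember the line with the largest timestamp for each stock,
  # then order the survivors by stock name (field 2).
  tail -n +2 "$workdir/infile" | awk -F, '
    NF == 4 && ($1 "") > best[$2] {   # concatenating "" forces a string comparison
      best[$2] = $1
      line[$2] = $0
    }
    END { for (s in line) print line[s] }
  ' | sort -t, -k2,2
)

rm -rf "$workdir"
echo "$latest_rows"
```

The final sort is there only because the order of a for (s in line) loop is unspecified in awk. The NF == 4 test also skips blank lines, as in the answer's script.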