Analyzing numeric data

Time: 2015-02-04 11:12:01

Tags: bash text numeric

Using some bash scripting, I would like to analyze a large log.txt file that contains many lines, each in the following format:

PHE 233,R PHE 233,0.0,0.0,0.0,-0.07884,0.0296770011962,0.00209848087911,0.023555,0.0757544518494,0.00535664866078,-0.065675,0.0859064571205,0.00607450383776,0.0,0.0,0.0,-0.12096,0.0486756448339,0.00344188785407
TYR 234,R TYR 234,0.0,0.0,0.0,-1.25531,0.629561517169,0.0445167217964,-0.004085,0.179779219531,0.0127123105246,0.169925,0.199097411774,0.0140783129982,-0.06675426,0.0227214659046,0.00160665026196,-1.15622426,0.59309226863,0.0419379565017
GLY 235,R GLY 235,0.0,0.0,0.0,-0.039345,0.0259211491836,0.00183290203639,-0.053115,0.0245550763591,0.00173630610061,0.098535,0.0441429357316,0.00312137691973,0.0,0.0,0.0,0.006075,0.0208364914273,0.00147336243844
THR 236,R THR 236,0.0,0.0,0.0,-0.03241,0.0100624003101,0.000711519149426,-0.115375,0.0590932684407,0.00417852508369,0.116505,0.0563931731241,0.00398759951286,0.0,0.0,0.0,-0.03128,0.0262172004608,0.00185383602295

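From each line of log.txt I need to extract only the first, second, and last fields and write them to a new log file, final_log.txt. For the example above, that would be: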

PHE 233 0.00344188785407
TYR 234 0.0419379565017
THR 236 0.00185383602295

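Most importantly: because a typical log contains a large number of lines, I would like to sort the lines in the new file by the value of the last field and apply a selection threshold to them. In the end, only those lines of log.txt whose last-column value is equal to or above the defined threshold should be written to final_log.txt. I would be very grateful for any solution to this (for me) non-trivial problem.

Gleb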

1 answer:

Answer 0: (score: 0)

With awk, treating both spaces and commas as field separators:

awk -F'[ ,]' '{print $1" "$2" "$NF}' file

OR

$ awk -F'[ ,]' '{print $1,$2,$NF}' file
PHE 233 0.00344188785407
TYR 234 0.0419379565017
GLY 235 0.00147336243844
THR 236 0.00185383602295
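
The commands above only extract the three fields. The question also asks for a threshold and sorting, which could be sketched roughly as follows. This is a minimal sketch, not a tested solution for real logs: the helper name filter_log and the threshold value in the usage line are assumptions for illustration, descending order is assumed because the question does not state a direction, and the g sort key flag (general numeric) is an extension found in GNU and BSD sort rather than in POSIX.

```shell
# filter_log THRESHOLD INPUT OUTPUT   (hypothetical helper name)
# Keeps the first, second and last fields of every line whose last field
# is greater than or equal to THRESHOLD, then sorts the kept lines by that
# value, largest first.
filter_log() {
    # "+ 0" forces a numeric comparison between the last field and the threshold.
    awk -F'[ ,]' -v t="$1" '$NF + 0 >= t + 0 { print $1, $2, $NF }' "$2" |
        # LC_ALL=C makes "." the decimal point regardless of the user's locale;
        # "g" compares as floating point, "r" reverses to descending order.
        LC_ALL=C sort -k3,3gr > "$3"
}
```

It would be called as: filter_log 0.0018 log.txt final_log.txt. With the four sample lines and that assumed threshold, the GLY 235 line (0.00147336243844) is dropped and the remaining lines come out in the order TYR 234, PHE 233, THR 236. Dropping the r from the sort key gives ascending order instead.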