Parsing line output into a table in bash

Time: 2014-08-25 10:18:28

Tags: bash awk sed

My file contains only lines of the form

new file (7,59) ; lim = 0.876 ; dim = 0.000433344 ; r_d = 0.00003

Is it possible to use bash to parse this output into something like

7,59,0.876,0.000433344,0.00003

and then read it into Python?

8 answers:

Answer 0 (score: 3)

sed 's/[^0-9,;.]//g;y/;/,/' YourFile
  1. Delete every character that is not a digit, a comma, a dot or a semicolon
  2. Change each ; to ,
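
Since the end goal is Python, the same two steps can also be sketched with the standard re module. The function name to_csv_line is made up for illustration, and the sketch assumes every line follows the format shown in the question.

```python
import re

def to_csv_line(line):
    """Mirror the two sed steps on a single line of text."""
    # Step 1: delete every character that is not a digit, ',', ';' or '.'
    kept = re.sub(r"[^0-9,;.]", "", line)
    # Step 2: turn each ';' into ','
    return kept.replace(";", ",")

sample = "new file (7,59) ; lim = 0.876 ; dim = 0.000433344 ; r_d = 0.00003"
print(to_csv_line(sample))  # 7,59,0.876,0.000433344,0.00003
```

Like the sed command, this relies on the labels (lim, dim, r_d) containing no digits, commas or dots; a label such as r2 would leak its digit into the output.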

Answer 1 (score: 1)

If the content is always in the format you mentioned, you can try the following sed command

$ sed 's/^[^(]*(\([^)]*\))\s*;\s*\S*\s*=\s*\(\S\+\)\s*;\s*\S*\s*=\s*\(\S\+\)\s*;\s*\S*\s*=\s*\(\S\+\)$/\1,\2,\3,\4/' file
7,59,0.876,0.000433344,0.00003

Answer 2 (score: 1)

Using sed:

sed 's/[^0-9,.][^0-9,.]*/ /g' input

With better formatting:

 sed 's/[^0-9,.][^0-9,.]*/ /g' input | column -to,

which gives:

7,59,0.876,0.000433344,0.00003

Answer 3 (score: 0)

You can grep for the numbers:

$ grep -o '[0-9.]*' file
7
59
0.876
0.000433344
0.00003

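The -o flag tells grep to print only the matched parts, so you get all the values without the surrounding text.

If you want them comma-separated, pipe the result to tr to replace each newline with a comma, and finally to sed to replace the trailing comma with a newline:

$ grep -o '[0-9.]*' file | tr -s '\n' ',' | sed 's/,$/\n/'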
7,59,0.876,0.000433344,0.00003
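
Note that tr replaces every newline, including the ones between input lines, so a file with several lines comes out as one long row. A minimal Python sketch of the same extract-and-join idea that keeps one row per input line might look like this. extract_rows is a hypothetical helper name, and the pattern is the one from the grep command with + in place of *.

```python
import re

def extract_rows(text):
    """Return one comma-separated row per input line that contains numbers."""
    rows = []
    for line in text.splitlines():
        # Like grep -o: keep only the matched runs of digits and dots.
        values = re.findall(r"[0-9.]+", line)
        if values:
            # Joining the values never produces a trailing comma.
            rows.append(",".join(values))
    return rows

# The second line is made-up sample data for illustration.
text = (
    "new file (7,59) ; lim = 0.876 ; dim = 0.000433344 ; r_d = 0.00003\n"
    "new file (8,60) ; lim = 0.5 ; dim = 0.25 ; r_d = 0.125\n"
)
for row in extract_rows(text):
    print(row)
```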

Answer 4 (score: 0)

Using GNU awk:

cat file

new file (7,59) ; lim = 0.876 ; dim = 0.000433344 ; r_d = 0.00003
new file (7,59) ; lim = 0.876 ; dim = 0.000433344 ; r_d = 0.00003
new file (7,59) ; lim = 0.876 ; dim = 0.000433344 ; r_d = 0.00003
new file (7,59) ; lim = 0.876 ; dim = 0.000433344 ; r_d = 0.00003
new file (7,59) ; lim = 0.876 ; dim = 0.000433344 ; r_d = 0.00003
new file (7,59) ; lim = 0.876 ; dim = 0.000433344 ; r_d = 0.00003
new file (7,59) ; lim = 0.876 ; dim = 0.000433344 ; r_d = 0.00003

awk -F ' *[=()] *' -v RS=' ; |\n' -v OFS= -v ORS= 'NF{print $2, (NR%4==0)? "\n":","}' file
7,59,0.876,0.000433344,0.00003
7,59,0.876,0.000433344,0.00003
7,59,0.876,0.000433344,0.00003
7,59,0.876,0.000433344,0.00003
7,59,0.876,0.000433344,0.00003
7,59,0.876,0.000433344,0.00003
7,59,0.876,0.000433344,0.00003
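
The awk command treats each " ; "-separated chunk as a record and splits it on =, ( and ). The same record-then-field idea can be sketched in Python, here keeping the labels instead of discarding them. parse_line and the coords key are names invented for this sketch, and treating the pair in parentheses as integers is an assumption about the data.

```python
def parse_line(line):
    """Split a line into ';'-separated records, then each record on '='."""
    head, *pairs = [part.strip() for part in line.split(";")]
    # The first record looks like "new file (7,59)": take what is inside (...).
    inside = head[head.index("(") + 1 : head.index(")")]
    record = {"coords": tuple(int(n) for n in inside.split(","))}
    # The remaining records look like "name = value".
    for pair in pairs:
        name, value = (s.strip() for s in pair.split("="))
        record[name] = float(value)
    return record

sample = "new file (7,59) ; lim = 0.876 ; dim = 0.000433344 ; r_d = 0.00003"
print(parse_line(sample))
```

Keeping the names makes it harder to mix up columns later, at the cost of being stricter about the input format than the awk one-liner.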

Answer 5 (score: 0)

Also with GNU awk, using FPAT:

awk -v FPAT="[0-9.]+" '{for(i=1;i<=NF;i++)printf "%s%s", $i,(i!=NF?",":"\n")}'

Test:

$ echo "new file (7,59) ; lim = 0.876 ; dim = 0.000433344 ; r_d = 0.00003"|awk -v FPAT="[0-9.]+" '{for(i=1;i<=NF;i++)printf "%s%s", $i,(i!=NF?",":"\n")}'      
7,59,0.876,0.000433344,0.00003

The FPAT pattern could be improved.
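
One way the pattern might be tightened, offered as an assumption about what an improvement could look like: [0-9.]+ also matches a lone dot or a run such as 1.2.3. The Python sketch below compares it with a stricter pattern for unsigned decimals. The names LOOSE and STRICT and the second test string are made up for illustration.

```python
import re

LOOSE = re.compile(r"[0-9.]+")               # same pattern as the FPAT above
STRICT = re.compile(r"[0-9]+(?:\.[0-9]+)?")  # digits, optional fractional part

sample = "new file (7,59) ; lim = 0.876 ; dim = 0.000433344 ; r_d = 0.00003"
tricky = "run no. 3 ; lim = 0.5"  # made-up line containing a stray dot

print(LOOSE.findall(sample) == STRICT.findall(sample))  # True
print(LOOSE.findall(tricky))   # ['.', '3', '0.5']
print(STRICT.findall(tricky))  # ['3', '0.5']
```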

Answer 6 (score: 0)

Many solutions, only perl is missing ;)

perl -nlE '$,=",";say m/[\d.]+/g'
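  • set the output field separator ($,) to ,
  • match every run of digits and dots (in list context the match returns a list)
  • print the list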

or (ofc) @neronlevelu's solution

perl -plE 's/[^\d,;.]//g;y/;/,/'
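  • remove everything that is not a digit, a comma, a semicolon or a dot
  • change each ; to , (y transliterates all occurrences of the characters found in the search list with the corresponding character in the replacement list), also known as tr.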

Answer 7 (score: 0)

$ sed -r 's/[^0-9.]+/,/g;s/^,//' file
7,59,0.876,0.000433344,0.00003

$ awk -F'[^0-9.]+' -v OFS=',' '{$1=$1;sub(/^,/,"")} 1' file
7,59,0.876,0.000433344,0.00003

$ sed -r 's/[^0-9.,;]+//g;s/;/,/g' file
7,59,0.876,0.000433344,0.00003

$ awk -F';' -v OFS=',' '{$1=$1;gsub(/[^0-9.,]/,"")} 1' file
7,59,0.876,0.000433344,0.00003

I personally prefer the last two, since they don't add a comma only to remove it again, which always feels a bit clunky and error-prone.
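
For the last part of the question, reading the result into Python, a minimal sketch with the standard csv module follows. It assumes the output of one of the commands above was saved to a file or captured as text; here an in-memory string stands in for that file, and read_rows is a hypothetical helper. Treating the first two columns as integers is an assumption about the data.

```python
import csv
import io

def read_rows(stream):
    """Yield (int, int, float, ...) tuples from comma-separated lines."""
    for fields in csv.reader(stream):
        if not fields:
            continue  # skip blank lines
        a, b, *rest = fields
        yield (int(a), int(b), *(float(x) for x in rest))

# Stand-in for an opened file holding the redirected sed/awk output.
data = io.StringIO("7,59,0.876,0.000433344,0.00003\n")
for row in read_rows(data):
    print(row)
```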