Converting CSV to an HTML table with AWK

Time: 2016-11-02 00:58:05

Tags: html linux csv awk

I'm just messing around in Linux and getting used to AWK. How can I convert a CSV file into an HTML file? For example, this is the data I load in the shell:

user$ cat table.csv
Ep#,Featured Film,Air date
211,First Spaceship on Venus,12/29/90
310,Fugitive Alien,08/17/91
424,Manos: The Hands of Fate,01/30/93

After running the code, this is what it should output:

user$ csv2html.awk table.csv
<html><body><table>
<tr>
<th>Ep#</th>
<th>Featured Film</th>
<th>Air date</th>
</tr>
<tr>
<td>211</td>
<td>First Spaceship on Venus</td>
<td>12/29/90</td>
</tr>
<tr>
<td>310</td>
<td>Fugitive Alien</td>
<td>08/17/91</td>
</tr>
<tr>
<td>424</td>
<td>Manos: The Hands of Fate</td>
<td>01/30/93</td>
</tr>
</table></body></html>

I have tried a few things along these lines, but I get some compile errors:

#!/bin/awk
print "<tr>
for( i = 1; i <= NF; i++)
     print "<td> "$i" </td"
#print "</tr>"

2 answers:

Answer 0 (score: 2)

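There are many ways to do this in AWK, but my preferred approach is the code below. I have included some explanation as comments in the code. Hope this helps!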

To run it from the CLI, save the code in a file, for example 'csv_to_html.awk', make it executable, and run it with 'table.csv' as the argument:

$ chmod +x csv_to_html.awk
$ ./csv_to_html.awk table.csv > table.html

Code:

#!/bin/awk -f

# Set field separator as comma for csv and print the HTML header line
BEGIN {
    FS=",";
    print "<html><body><table>"
}
# Function to print a row with one argument to handle either a 'th' tag or 'td' tag
function printRow(tag) {
    print "<tr>";
    for(i=1; i<=NF; i++) print "<"tag">"$i"</"tag">";
    print "</tr>"
}
# If CSV file line number (NR variable) is 1, call printRow function with 'th' as argument
NR==1 {
    printRow("th")
}
# If CSV file line number (NR variable) is greater than 1, call printRow function with 'td' as argument
NR>1 {
    printRow("td")
}
# Print HTML footer
END {
    print "</table></body></html>"
}
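The script above prints each field verbatim, so a field containing `&`, `<` or `>` would end up as raw markup in the table. The following is a minimal sketch of one way to escape those characters. It is not part of the original answer: `rows_escaped` and `esc` are hypothetical names, and for brevity it prints only the rows, without the `<html><body><table>` wrapper.

```shell
# Hypothetical helper: print table rows with HTML-escaped cell contents.
# Like the answer, it assumes a plain comma separator (no quoted fields).
rows_escaped() {
    awk -F, '
    # esc: replace the three HTML-significant characters in s.
    # In a gsub replacement, "\\&" produces a literal ampersand.
    # The ampersand is handled first so later entities are not re-escaped.
    function esc(s) {
        gsub(/&/, "\\&amp;", s)
        gsub(/</, "\\&lt;", s)
        gsub(/>/, "\\&gt;", s)
        return s
    }
    {
        tag = (NR == 1) ? "th" : "td"   # header cells for the first line
        print "<tr>"
        for (i = 1; i <= NF; i++) print "<" tag ">" esc($i) "</" tag ">"
        print "</tr>"
    }' "$@"
}

# Example: a header line containing characters that need escaping
printf 'A&B,x<y\n' | rows_escaped
```

The same `esc` function could be dropped into the script above and applied to `$i` inside `printRow`.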

Answer 1 (score: 0)

Another, similar awk script:

    BEGIN{FS = ","; header = "<html><body><table>"; print header}
         {c = NR == 1 ? "th" : "td";
          OFS = et(c) bt(c);
          $1 = $1;
          print wrap("tr", wrap(c,$0)) }
      END{print "</table></body></html>"}   # close the tags in reverse order of opening

    function wrap(t, v) { return bt(t) v et(t)}
    function bt(t) {return "<" t ">"}
    function et(t) {return "</" t ">"}

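Instead of looping over the fields, this sets OFS to the closing tag followed by the opening tag, so that rebuilding the record inserts the corresponding markup between fields.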
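The OFS technique can be seen in isolation in the sketch below. It is a simplified illustration rather than the answer's exact code: `wrap_cells` is a hypothetical name, only `td` cells are produced, and the comma separator is passed with `-F,`.

```shell
# Hypothetical helper: wrap every comma-separated field of each line in <td> tags.
wrap_cells() {
    awk -F, '{
        OFS = "</td><td>"        # closing tag + opening tag goes between fields
        $1 = $1                  # assigning to a field makes awk rebuild $0 using OFS
        print "<td>" $0 "</td>"  # add the outermost opening and closing tags
    }' "$@"
}

# Example with one line of the sample data
printf '211,First Spaceship on Venus,12/29/90\n' | wrap_cells
# prints: <td>211</td><td>First Spaceship on Venus</td><td>12/29/90</td>
```

The assignment `$1 = $1` changes no data. Its only purpose is to force awk to rejoin the fields with the current OFS, which is what puts the tags between them.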