查找行中包含的数字范围

时间:2015-02-09 23:03:54

标签: bash awk

在这个awk脚本中,如果字段$ 5包含1900或更高的年份,我想打印字段$ 2。

22,Grover Cleveland,http://en.wikipedia.org/wiki/Grover_Cleveland,4/03/1885,4/03/1889,Democratic ,Grover_Cleveland_2.jpg,thmb_Grover_Cleveland_2.jpg,New York
23,Benjamin Harrison,http://en.wikipedia.org/wiki/Benjamin_Harrison,4/03/1889,4/03/1893,Republican ,BenjaminHarrison.gif,thmb_BenjaminHarrison.gif,Indiana
24,Grover Cleveland (2nd term),http://en.wikipedia.org/wiki/Grover_Cleveland,4/03/1893,4/03/1897,Democratic ,Grover_Cleveland.jpg,thmb_Grover_Cleveland.jpg,New York
25,William McKinley,http://en.wikipedia.org/wiki/William_McKinley,4/03/1897,14/9/1901,Republican ,WilliamMcKinley.gif,thmb_WilliamMcKinley.gif,Ohio
26,Theodore Roosevelt,http://en.wikipedia.org/wiki/Theodore_Roosevelt,14/9/1901,4/3/1909,Republican ,TheodoreRoosevelt.jpg,thmb_TheodoreRoosevelt.jpg,New York
27,William Howard Taft,http://en.wikipedia.org/wiki/William_Howard_Taft,4/3/1909,4/03/1913,Republican ,WilliamHowardTaft.jpg,thmb_WilliamHowardTaft.jpg,Ohio
28,Woodrow Wilson,http://en.wikipedia.org/wiki/Woodrow_Wilson,4/03/1913,4/03/1921,Democratic ,WoodrowWilson.gif,thmb_WoodrowWilson.gif,New Jersey
29,Warren G. Harding,http://en.wikipedia.org/wiki/Warren_G._Harding,4/03/1921,2/8/1923,Republican ,WarrenGHarding.gif,thmb_WarrenGHarding.gif,Ohio
30,Calvin Coolidge,http://en.wikipedia.org/wiki/Calvin_Coolidge,2/8/1923,4/03/1929,Republican ,CoolidgeWHPortrait.gif,thmb_CoolidgeWHPortrait.gif,Massachusetts
31,Herbert Hoover,http://en.wikipedia.org/wiki/Herbert_Hoover,4/03/1929,4/03/1933,Republican ,HerbertHover.gif,thmb_HerbertHover.gif,Iowa

这就是我到目前为止所做的事情,但它给了我所有的线条,而不仅仅是那些包含超过1900年的线条。

#!/bin/awk -f

BEGIN{ FS=",";
}{
if($5 >= 1900)
{ print $2;}

3 个答案:

答案 0 :(得分:4)

年份以4/03/1885的形式显示。需要额外的步骤来分割该日期并获得年份:

$ awk -F, '{split($5,mdy,"/")} mdy[3]>=1900{print $2}' file
William McKinley
Theodore Roosevelt
William Howard Taft
Woodrow Wilson
Warren G. Harding
Calvin Coolidge

工作原理:

  • -F,

    使用逗号作为字段分隔符。

  • split($5, mdy, "/")

    拆分/处的第五个字段,并将结果放入数组mdy

  • mdy[3]>=1900{print $2}

    选择大于或等于1900的年份并打印字段2.

答案 1 :(得分:1)

另一种方法是使用awk的匹配和子串函数

awk -v FS="," 'match($5,/[0-9]{4}/){name=substr($5,RSTART,RLENGTH)};{if(name>=1900){print $0}}'

结果

25,William McKinley,http://en.wikipedia.org/wiki/William_McKinley,4/03/1897,14/9/1901,Republican ,WilliamMcKinley.gif,thmb_WilliamMcKinley.gif,Ohio
26,Theodore Roosevelt,http://en.wikipedia.org/wiki/Theodore_Roosevelt,14/9/1901,4/3/1909,Republican ,TheodoreRoosevelt.jpg,thmb_TheodoreRoosevelt.jpg,New York
27,William Howard Taft,http://en.wikipedia.org/wiki/William_Howard_Taft,4/3/1909,4/03/1913,Republican ,WilliamHowardTaft.jpg,thmb_WilliamHowardTaft.jpg,Ohio
28,Woodrow Wilson,http://en.wikipedia.org/wiki/Woodrow_Wilson,4/03/1913,4/03/1921,Democratic ,WoodrowWilson.gif,thmb_WoodrowWilson.gif,New Jersey
29,Warren G. Harding,http://en.wikipedia.org/wiki/Warren_G._Harding,4/03/1921,2/8/1923,Republican ,WarrenGHarding.gif,thmb_WarrenGHarding.gif,Ohio
30,Calvin Coolidge,http://en.wikipedia.org/wiki/Calvin_Coolidge,2/8/1923,4/03/1929,Republican ,CoolidgeWHPortrait.gif,thmb_CoolidgeWHPortrait.gif,Massachusetts
31,Herbert Hoover,http://en.wikipedia.org/wiki/Herbert_Hoover,4/03/1929,4/03/1933,Republican ,HerbertHover.gif,thmb_HerbertHover.gif,Iowa

答案 2 :(得分:0)

你搞砸了括号:

#!/bin/awk -f

BEGIN{ FS=",";
}{
if($5 >= 1900) print $2;}