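Parsing a file to get 2 sets of information

Time: 2013-08-14 20:40:22

Tags: parsing unix awk

I have a log file that records user input. Every line in the log is unique, and I need to extract 2 specific items from each one: the userId and the URL. I can't just use awk < file '{print $1, $6}' because the items are not always in the same position on each line.

Sample text: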

userId='1' managed:no address:123street phone:1234567890 http:/someurl.com
newuser:yes userId='2' managed:yes address:123street  http:/someurl.com
userId='3' address:123 street phone:1234567890 http:/someurl.com
userId='4' managed:no address:123street phone:1234567890 http:/someurl.com

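I need to parse the userId and the URL out into a file, but these items are not always in the same position on each line. Any suggestions would be greatly appreciated.

3 answers:

Answer 0: (score: 2)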

$ awk '{for(i=1; i<=NF && $i!~/userId/; i++); if(i<=NF) print $i, $NF}' file
userId='1' http:/someurl.com
userId='2' http:/someurl.com
userId='3' http:/someurl.com
userId='4' http:/someurl.com

Answer 1: (score: 1)

Try the following code:

gawk '{
    id = url = ""    # reset per line so stale values are never reported
    for (i=1; i<=NF; i++) {
        if ($i ~ /^userId=/) id=gensub(/userId=\047([0-9]+)\047/, "\\1", 1, $i)
        else if ($i ~ /^http/) url=$i
    }
    print "In line "NR", the id is "id" and the url is "url
}' file.txt

Sample input:

userId='1' managed:no address:123street phone:1234567890 http:/someurl1.com
newuser:yes userId='2' managed:yes address:123street  http:/someurl2.com
userId='3' address:123 street phone:1234567890 http:/someurl3.com
userId='4' managed:no address:123street phone:1234567890 http:/someurl4.com

Sample output:

In line 1, the id is 1 and the url is http:/someurl1.com
In line 2, the id is 2 and the url is http:/someurl2.com
In line 3, the id is 3 and the url is http:/someurl3.com
In line 4, the id is 4 and the url is http:/someurl4.com

The advantage of this solution is that the userId item and the http item can appear anywhere on the line.
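Since gensub is specific to gawk, the same position-independent idea can also be sketched in plain awk. This is only a rough sketch under a few assumptions: the function name extract_id_url and the output file name in the usage line are hypothetical, the id is assumed to be wrapped in single quotes as in the sample, and lines missing either item are silently skipped.

```shell
# Hypothetical helper: reads log lines on stdin, prints "id url" per line.
extract_id_url() {
    awk '
    {
        id = ""; url = ""                # reset so nothing leaks from the previous line
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^userId=/) {
                id = $i
                sub(/^userId=/, "", id)  # drop the key
                gsub("\047", "", id)     # drop the surrounding single quotes
            } else if ($i ~ /^http/) {
                url = $i
            }
        }
        if (id != "" && url != "") print id, url   # assumption: skip incomplete lines
    }'
}
```

To write the result to a file, as the question asks, something like extract_id_url < file.txt > ids_and_urls.txt should do; the output file name here is just a placeholder.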

Answer 2: (score: 0)

Using awk:

awk '{for(c=1;c<NF;c++){if(match($c,/userId/)){print $c,$NF; break}}}' your.file

Output:

userId='1' http:/someurl.com
userId='2' http:/someurl.com
userId='3' http:/someurl.com
userId='4' http:/someurl.com