Find text in a file and extract the needed content

Time: 2011-12-26 03:09:05

Tags: linux grep

I have many access_log files. Here is one line from them:

access_log.20111215:111.222.333.13 - - [15/Dec/2011:05:25:00 +0900] "GET /index.php?uid=01O9m5s23O0p&p=nutty&a=check_promotion&guid=ON HTTP/1.1" 302 - "http://xxx.com/index.php?uid=xxx&p=mypage&a=index&sid=&fid=&rand=1681" "Something/2.0 qqq(xxx;yyy;zzz)" "-" "-" 0

How can I extract the uid "01O9m5s23O0p" from the lines that contain "p=nutty&a=check_promotion" and write it to a new file?

For example, the "output.txt" file should be:

01O9m5s23O0p
01O9m5s0999p
01O9m5s3249p
fFDSFewrew23
SOMETHINGzzz
...

I tried:

grep "p=nutty&a=check_promotion" access* > using_grep.out

fgrep -o "p=nutty&a=check_promotion" access* > using_fgrep.out

But it prints the whole line. I just want to get the uid.

Key points:

1) Find the lines which have "p=nutty&a=check_promotion"

2) Extract uid from those lines.

3) Print them to a file.

2 answers:

Answer 0: (score: 2)

Do this in three stages:

(formatted to avoid scrolling)

grep 'p=nutty&a=check_promotion' access* \
| grep -o '[[:alnum:]]\{4\}m5s[[:alnum:]]\{4\}p' \
> output.txt
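
Note that the second pattern above assumes every uid has the shape `XXXXm5sXXXXp`, so it would skip a uid such as `fFDSFewrew23` from the question's expected output. The following is only a sketch of a more general pipeline, not the answerer's method. It assumes the requests are GET requests, that the uid contains no `&`, `"` or space, and that your grep supports `-h` and `-o` (GNU and BSD grep do). The `uid_demo` directory and the sample log lines are hypothetical and exist only to make the example self-contained.

```shell
# Hypothetical sample data modelled on the log line in the question.
mkdir -p uid_demo
cat > uid_demo/access_log.sample <<'EOF'
192.0.2.1 - - [15/Dec/2011:05:25:00 +0900] "GET /index.php?uid=01O9m5s23O0p&p=nutty&a=check_promotion&guid=ON HTTP/1.1" 302 - "http://xxx.com/index.php?uid=xxx&p=mypage&a=index" "Something/2.0" "-" "-" 0
192.0.2.2 - - [15/Dec/2011:05:26:00 +0900] "GET /index.php?uid=fFDSFewrew23&p=nutty&a=check_promotion&guid=ON HTTP/1.1" 302 - "http://xxx.com/index.php?uid=xxx&p=mypage&a=index" "Something/2.0" "-" "-" 0
192.0.2.3 - - [15/Dec/2011:05:27:00 +0900] "GET /index.php?uid=SKIPME000001&p=mypage&a=index HTTP/1.1" 200 - "http://xxx.com/index.php?uid=xxx&p=nutty&a=check_promotion" "Something/2.0" "-" "-" 0
EOF

# 1) Keep only the quoted request string, so the referer is never searched.
# 2) Keep the requests that contain the wanted parameters (fixed string).
# 3) Pull out the uid parameter; the leading [?&] prevents a match inside "guid=".
# 4) Drop the "?uid=" / "&uid=" prefix and write the values to a file.
grep -ho '"GET [^"]*"' uid_demo/access_log* \
  | grep -F 'p=nutty&a=check_promotion' \
  | grep -o '[?&]uid=[^&" ]*' \
  | cut -d= -f2 \
  > uid_demo/output.txt

cat uid_demo/output.txt
```

Restricting the search to the request string first matters here because the referer in each line also carries its own `uid=` and `p=` parameters.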

Answer 1: (score: 2)

If your p=nutty&a=check_promotion lines all have a similar structure, we can set the field separator and use awk to extract the uid and write it to a file.

awk -v FS="[?&=]" '
$0~/p=nutty&a=check_promotion/{ print $3 > "output_file"}' input_file
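
To see why the uid lands in `$3`, it may help to list the fields awk produces with this separator. The sketch below uses a shortened version of the request from the question (the text before `GET` contains none of `?`, `&` or `=`, so in the full line it simply stays part of field 1). The `uid_demo` directory and the `fields.txt` file name are hypothetical.

```shell
mkdir -p uid_demo

# Split on ?, & and = and print every field with its index.
printf '%s\n' 'GET /index.php?uid=01O9m5s23O0p&p=nutty&a=check_promotion&guid=ON HTTP/1.1' \
  | awk -F'[?&=]' '{ for (i = 1; i <= NF; i++) printf "%d: %s\n", i, $i }' \
  > uid_demo/fields.txt

cat uid_demo/fields.txt
# 1: GET /index.php
# 2: uid
# 3: 01O9m5s23O0p
# 4: p
# 5: nutty
# 6: a
# 7: check_promotion
# 8: guid
# 9: ON HTTP/1.1
```

Field 3 holds the uid only because `uid` is the first query parameter. If the parameter order differs, the uid moves to a different field.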

Test:

[jaypal:~/Temp] cat input_file
access_log.20111215:210.136.161.13 - - [15/Dec/2011:05:25:00 +0900] "GET /index.php?uid=01O9m5s23O0p&p=nutty&a=check_promotion&guid=ON HTTP/1.1" 302 - "http://xxx.com/index.php?uid=xxx&p=mypage&a=index&sid=&fid=&rand=1681" "Something/2.0 qqq(xxx;yyy;zzz)" "-" "-" 0 
[jaypal:~/Temp] awk -v FS="[?&=]" '
$0~/p=nutty&a=check_promotion/{ print $3 > "output_file"}' input_file
[jaypal:~/Temp] cat output_file 
01O9m5s23O0p
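
The command above depends on `uid` being the first query parameter. If that is not guaranteed, one possible variation (a sketch, not part of the original answer) is to look for the field named `uid` and print the field that follows it. It assumes the first `uid` parameter on a matching line belongs to the request rather than the referer, and it adds a space to the separator so that a uid at the end of the query string is not glued to ` HTTP/1.1`. The `uid_demo` directory and the sample lines are hypothetical.

```shell
# Hypothetical sample data; the second line has uid as the last parameter.
mkdir -p uid_demo
cat > uid_demo/input_file <<'EOF'
192.0.2.1 - - [15/Dec/2011:05:25:00 +0900] "GET /index.php?uid=01O9m5s23O0p&p=nutty&a=check_promotion&guid=ON HTTP/1.1" 302 - "-" "Something/2.0" "-" "-" 0
192.0.2.2 - - [15/Dec/2011:05:26:00 +0900] "GET /index.php?p=nutty&a=check_promotion&uid=fFDSFewrew23 HTTP/1.1" 302 - "-" "Something/2.0" "-" "-" 0
192.0.2.3 - - [15/Dec/2011:05:27:00 +0900] "GET /index.php?uid=SKIPME000001&p=mypage&a=index HTTP/1.1" 200 - "-" "Something/2.0" "-" "-" 0
EOF

# Split on ?, &, = and space. On matching lines, find the first field that is
# exactly "uid" (so "guid" is ignored) and print the field right after it.
awk -F'[?&= ]' '
/p=nutty&a=check_promotion/ {
    for (i = 1; i < NF; i++)
        if ($i == "uid") { print $(i + 1); break }
}' uid_demo/input_file > uid_demo/output_file

cat uid_demo/output_file
```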