How to find dates and IP addresses in text using regular expressions in Unix bash

Time: 2014-08-27 05:27:40

Tags: regex linux bash

I have to find dates and IP addresses in a text file using regular expressions in Unix bash. Below is my script, but it fails to parse the file:

#!/bin/bash
# Script to parse Record1.txt: tokenize on commas, then find the date with time, IP address, and URL in each record.
OIFS="$IFS"
IFS=$','
arr=($(sed 's_{,}_\s_g' Record1.txt ))
IFS="$OIFS"
pattern_date=""
pattern_ipaddress=""
for i in "${arr[@]}"
do
        pattern_date=$(echo $i|grep '\d{4}[-/]\d{2}[-/]\d{2}\s\d{2}:\d{2}:\d{2}')
        pattern_ipaddress=$(echo $i|grep '\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}')
        echo "pattern_date =$pattern_date"
        echo "pattern_ipaddress=$pattern_ipaddress"
        if [  $i = '$pattern_date'  ]
        then
                echo $i
        elif [ $i = '$pattern_ipaddress' ]
        then
                echo $i
        fi
#check for ipaddress: ipaddress patern will be \d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3} 




done

The sample data is as follows:

2014/08/06 05:15:00,00.28.185.69,170188,13609FAD,tsp08.sprintpcs.com,0123456789,Framed-User,10.152.181.33,000007632306609,102400,102400,300,Unknown,2014/08/06 05:15:00,,2014/08/06 05:10:00,2014/08/06 05:15:00,0123456789,120E00644D93,Stop,Mobile,1xRSS~#~pdc001.abc.net

Can anybody tell me where the problem is?

3 answers:

Answer 0 (score: 1)

Just use sed instead of those loops:

sed 's|^\([^ ]*\) \([^,]*,\)\{7\}\([^,]*\),.*|\1 \3|' Record1.txt
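
To see what that substitution keeps, here is a minimal sketch that runs the same sed expression on one line instead of on Record1.txt. The variable names `line` and `result` are only for illustration, and the line is the question's sample record cut off after the ninth field.

```bash
#!/bin/bash
# First nine fields of the question's sample record (illustrative input).
line='2014/08/06 05:15:00,00.28.185.69,170188,13609FAD,tsp08.sprintpcs.com,0123456789,Framed-User,10.152.181.33,000007632306609'

# \1 = text before the first space (the date part of field 1)
# \2 = repeated group that skips seven comma-terminated chunks
#      (the time, then fields 2 through 7)
# \3 = the next field, which is the eighth field of the record
result=$(printf '%s\n' "$line" | sed 's|^\([^ ]*\) \([^,]*,\)\{7\}\([^,]*\),.*|\1 \3|')

echo "$result"
```

Note that this relies on field positions rather than on recognizing what a date or an IP address looks like, so it picks the address in the eighth field and ignores the one in the second field.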

Answer 1 (score: 1)

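Maybe you should use Perl regular expressions. By default grep uses basic regular expressions, which do not understand \d; with GNU grep, the -P option enables Perl-compatible syntax: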

grep -P '\d{4}[-/]\d{2}[-/]\d{2}\s\d{2}:\d{2}:\d{2}'
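
Where `grep -P` is not available, a rough equivalent can be written with extended regular expressions. The sketch below assumes a grep that supports `-o` (GNU and BSD grep do) and uses a shortened, hypothetical record in the same layout as the question's data; `[0-9]` stands in for `\d`, and a literal space stands in for `\s`.

```bash
#!/bin/bash
# Hypothetical shortened record in the same comma-separated layout.
line='2014/08/06 05:15:00,00.28.185.69,170188,Framed-User,10.152.181.33,Stop'

# -E enables extended regexes, so {n} needs no backslash.
# -o prints only the matched text, one match per output line.
dates=$(printf '%s\n' "$line" | grep -oE '[0-9]{4}[-/][0-9]{2}[-/][0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}')
ips=$(printf '%s\n' "$line" | grep -oE '([0-9]{1,3}\.){3}[0-9]{1,3}')

printf 'dates:\n%s\n' "$dates"
printf 'ips:\n%s\n' "$ips"
```

The IP pattern only checks the shape of four dot-separated groups of digits; it does not verify that each group is in the range 0 to 255.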

Answer 2 (score: 0)

Using cut may help:

data="2014/08/06 05:15:00,00.28.185.69,170188,13609FAD,tsp08.sprintpcs.com,0123456789,Framed-User,10.152.181.33,000007632306609,102400,102400,300,Unknown,2014/08/06 05:15:00,,2014/08/06 05:10:00,2014/08/06 05:15:00,0123456789,120E00644D93,Stop,Mobile,1xRSS~#~pdc001.abc.net"
echo "$data" | cut -d ',' -f1,8
2014/08/06 05:15:00,10.152.181.33

You can then extract the date and the address with:
reduce="$(echo "$data" | cut -d ',' -f1,8 )"
mydate="$(echo "$reduce" | cut -d ' ' -f1 )"
myip="$(echo "$reduce" | cut -d ',' -f2 )"
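
If the fields have to be recognized by pattern rather than by position, as the question's loop tries to do, one possible sketch is to split each line with `read` and test every field with bash's `[[ ... =~ ... ]]` operator. This is not taken from any of the answers; the names `date_re`, `ip_re`, `fields`, `dates`, and `ips` are hypothetical, and the input is a shortened record in the question's layout. It avoids two pitfalls in the original script: `\d` is not understood by grep's default syntax, and `'$pattern_date'` in single quotes is compared as a literal string rather than expanded.

```bash
#!/bin/bash
# Hypothetical shortened record in the same comma-separated layout.
line='2014/08/06 05:15:00,00.28.185.69,170188,Framed-User,10.152.181.33,Stop'

# bash's =~ uses extended regexes, which have no \d, so [0-9] is used.
# The ^ and $ anchors require the whole field to match.
date_re='^[0-9]{4}[-/][0-9]{2}[-/][0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}$'
ip_re='^([0-9]{1,3}\.){3}[0-9]{1,3}$'

dates=()
ips=()

# Split the line on commas; the IFS change applies to this read only.
IFS=',' read -r -a fields <<< "$line"

for field in "${fields[@]}"; do
    if [[ $field =~ $date_re ]]; then
        dates+=("$field")
    elif [[ $field =~ $ip_re ]]; then
        ips+=("$field")
    fi
done

printf 'date: %s\n' "${dates[@]}"
printf 'ip: %s\n' "${ips[@]}"
```

To process a whole file, the same body could be wrapped in a `while IFS= read -r line; do ... done < Record1.txt` loop.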