How do I extract the IP address from a line matching a string in a file, using bash?

Time: 2017-12-14 13:39:18

Tags: bash string-parsing

Example:

I have a file named example.txt that contains this text:

some text here INFO    200     cv 58687 http://saomesitehoere.com live connect ASDFG 61.215.80.6 07:16

some text here INFO    100     fv 582702687 http://saomesitehoere.org live connect 31.15.80.1 07:16:33

some text here INFO    00     ov 587 http://saomesitehoere.uk live connect ASGGGGFG 91.211.80.6 09:16

some text here INFO    800    kcv 277 http://saomesitehoere.za live connect AFG 71.215.81.5 09:14

I want to extract the IP address from the line that contains the string "ASDFG", which in this case is 61.215.80.6

Can anyone help?

5 answers:

Answer 0 (score: 3)

$ grep -oP 'ASDFG \K\S*' < file
61.215.80.6
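
For context: -o prints only the matched text, -P switches grep to Perl-compatible regular expressions, and \K discards everything matched before it, so only the run of non-space characters after "ASDFG " is printed. A minimal sketch of the same idea wrapped in a function, assuming GNU grep built with PCRE support; the name ip_after_tag is hypothetical and the input is the sample from the question:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the non-space token that follows "<tag> "
# on each matching line of standard input.
# Assumes GNU grep with PCRE support (-P); \K drops the text matched so far.
ip_after_tag() {
    grep -oP "$1"' \K\S+'
}

# Sample data copied from the question; prints 61.215.80.6
ip_after_tag ASDFG <<'EOF'
some text here INFO    200     cv 58687 http://saomesitehoere.com live connect ASDFG 61.215.80.6 07:16
some text here INFO    100     fv 582702687 http://saomesitehoere.org live connect 31.15.80.1 07:16:33
some text here INFO    00     ov 587 http://saomesitehoere.uk live connect ASGGGGFG 91.211.80.6 09:16
some text here INFO    800    kcv 277 http://saomesitehoere.za live connect AFG 71.215.81.5 09:14
EOF
```

This only works while the IP is the token directly after the tag, which holds for the sample line.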

Answer 1 (score: 1)

You can try awk:

awk '/\<ASDFG\>/{print $(NF-1)}' file
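
The one-liner above depends on the IP being the second-to-last field ($(NF-1)), and \< and \> are GNU awk word-boundary operators. As a hedged alternative, the sketch below looks for a field that equals the tag exactly and prints the field after it, which should work in any POSIX awk. The function name field_after_tag is hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical helper: for each input line, find the first field that is
# exactly equal to the tag and print the field right after it.
field_after_tag() {
    awk -v tag="$1" '{
        for (i = 1; i < NF; i++)
            if ($i == tag) { print $(i + 1); break }
    }'
}

# Sample data from the question; the ASGGGGFG line is not an exact match.
# Prints 61.215.80.6
field_after_tag ASDFG <<'EOF'
some text here INFO    200     cv 58687 http://saomesitehoere.com live connect ASDFG 61.215.80.6 07:16
some text here INFO    00     ov 587 http://saomesitehoere.uk live connect ASGGGGFG 91.211.80.6 09:16
EOF
```

Comparing whole fields avoids the need for word-boundary operators, at the cost of assuming the tag is always a separate whitespace-delimited field.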

Answer 2 (score: 1)

You can use:

grep -oE '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}'
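
As written, this command names no input file, so it reads standard input, and it prints every dotted quad it sees rather than only those on "ASDFG" lines. A sketch of one way to combine it with a line filter, assuming a grep that supports -w and -o (GNU and BSD grep both do); the function name ips_on_lines_with is hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical helper: keep only lines containing the tag as a whole word,
# then print every dotted quad found on those lines.
ips_on_lines_with() {
    grep -w -- "$1" | grep -oE '[0-9]{1,3}(\.[0-9]{1,3}){3}'
}

# Sample data from the question; prints 61.215.80.6
ips_on_lines_with ASDFG <<'EOF'
some text here INFO    200     cv 58687 http://saomesitehoere.com live connect ASDFG 61.215.80.6 07:16
some text here INFO    100     fv 582702687 http://saomesitehoere.org live connect 31.15.80.1 07:16:33
EOF
```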

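Answer 3 (score: 0)

An IP address looks like one to three digits followed by a dot, repeated three times, followed by one to three more digits. So this will give you the IP address from such a line: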

$ grep ASDFG logfile | grep -o '\([[:digit:]]\{1,3\}\.\)\{3\}[[:digit:]]\{1,3\}'
61.215.80.6

Answer 4 (score: 0)

To extract all "valid" IP addresses from a file:

gawk '{match($0,/[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+/,a);split(a[0],b,".")} b[1]<=255&& b[2]<=255 && b[3]<=255 && b[4]<=255 &&length(a[0]){print a[0]}' input_file

This works in two steps:

  1. First, it extracts the substring matching the pattern digits.digits.digits.digits (match() returns the first such match on each line).
  2. Then it checks that each group of digits is less than or equal to 255.

Note: this requires gawk, because the three-argument form of match(), which fills the array a, is a gawk extension.
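
The two steps described above can also be sketched without gawk, using grep -o for the extraction and bash arithmetic for the range check. This is an illustrative variant under stated assumptions, not the answer's own method: the function name valid_ips is hypothetical, and unlike the one-liner it reports every candidate on a line rather than only the first.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every dotted quad in the input (files given as
# arguments, or standard input) whose four groups are all <= 255.
valid_ips() {
    local candidate octet ok
    local -a octets
    # Step 1: extract every digits.digits.digits.digits substring.
    grep -ohE '[0-9]+(\.[0-9]+){3}' "$@" |
    while IFS= read -r candidate; do
        IFS=. read -r -a octets <<< "$candidate"
        ok=1
        # Step 2: each group must be at most 3 digits long and <= 255.
        # The 10# prefix forces base 10 so "08" is not parsed as octal.
        for octet in "${octets[@]}"; do
            (( ${#octet} <= 3 && 10#$octet <= 255 )) || ok=0
        done
        if (( ok )); then
            printf '%s\n' "$candidate"
        fi
    done
}

# Prints 61.215.80.6 and 10.0.0.08, one per line; 300.1.2.3 is rejected.
printf '%s\n' 'up 61.215.80.6 down 300.1.2.3' 'host 10.0.0.08' | valid_ips
```

To restrict the result to the "ASDFG" line from the question, the input could first be filtered, for example with grep ASDFG example.txt | valid_ips.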