I have this text:
156.48.459.20 - - [11/Aug/2019
156.48.459.20 - - [11/Aug/2019
235.145.41.12 - - [11/Aug/2019
235.145.41.12 - - [11/Aug/2019
66.23.114.251 - - [11/Aug/2019
I want to match every line from that day, so I wrote this simple regex: '/.*11\/Aug\/2019.*/'
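As you can see, the text contains repeated IPs, and I don't want to match the duplicate lines. I searched and found this regex: (.).*\1
That regex looks a bit odd to me, but I tried to apply it to my current one anyway: (.*11\/Aug\/2019.*)\1
It didn't work. Can anyone help?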
This is the result I want:
156.48.459.20 - - [11/Aug/2019
235.145.41.12 - - [11/Aug/2019
66.23.114.251 - - [11/Aug/2019
Note: I am using the function preg_match_all():
preg_match_all('/(.*11\/Aug\/2019.*)\1/', $input_lines, $output_array);
Answer 0 (score: 4)
Does it have to be pure regex?
You can get the unique lines with PHP:
<?php
$input_lines = '156.48.459.20 - - [11/Aug/2019
156.48.459.20 - - [11/Aug/2019
235.145.41.12 - - [11/Aug/2019
235.145.41.12 - - [11/Aug/2019
66.23.114.251 - - [11/Aug/2019';
preg_match_all( '/.*11\/Aug\/2019/m', $input_lines, $output_array );
// PHP associative array abuse incoming
// Flip the array so that the values become keys, then read those keys back
// Array keys are unique, so only one copy of each line survives
$output_array[ 0 ] = array_keys( array_flip( $output_array[ 0 ] ) );
var_dump( $output_array );
Output:
array(1) {
  [0]=>
  array(3) {
    [0]=>
    string(30) "156.48.459.20 - - [11/Aug/2019"
    [1]=>
    string(30) "235.145.41.12 - - [11/Aug/2019"
    [2]=>
    string(30) "66.23.114.251 - - [11/Aug/2019"
  }
}
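The flip trick works because array keys cannot repeat. A more conventional way to get the same effect is the built-in array_unique(), sketched below. This is only an alternative illustration, not the answer's original code, and the variable names $matches and $unique_lines are made up for the example.

```php
<?php
$input_lines = "156.48.459.20 - - [11/Aug/2019\n"
    . "156.48.459.20 - - [11/Aug/2019\n"
    . "235.145.41.12 - - [11/Aug/2019\n"
    . "235.145.41.12 - - [11/Aug/2019\n"
    . "66.23.114.251 - - [11/Aug/2019";

// Collect every line for the target date (duplicates included).
preg_match_all('/.*11\/Aug\/2019/m', $input_lines, $matches);

// array_unique() keeps the first occurrence of each value (with its original key);
// array_values() then renumbers the result starting from 0.
$unique_lines = array_values(array_unique($matches[0]));

print_r($unique_lines);
```

Unlike the flip approach, this keeps the first occurrence of each line, so the result follows the order in which lines first appear in the log.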
Answer 1 (score: 2)
Almost a one-liner:
'~(?m)^(?:([\d.]*[- ]*\[11/Aug/2019.*)\R*(?=[\S\s]*?\1)|(?!.*\[11/Aug/2019).*\R*)~'
PHP:
$target = <<<'EOS'
156.48.459.20 - - [11/Aug/2019
156.48.459.20 - - [11/Aug/2019
235.145.41.12 - - [11/Aug/2019
235.145.41.12 - - [11/Aug/2019
66.23.114.251 - - [11/Aug/2019
66.23.114.251 - - [09/Aug/2019
156.48.459.20 - - [11/Aug/2019
235.145.41.12 - - [11/Aug/2019
66.23.114.251 - - [01/Aug/2019
66.23.114.251 - - [11/Aug/2019
235.145.41.12 - - [11/Aug/2019
EOS;
$res = preg_replace ( '~(?m)^(?:([\d.]*[- ]*\[11/Aug/2019.*)\R*(?=[\S\s]*?\1)|(?!.*\[11/Aug/2019).*\R*)~', '', $target );
echo $res."\n";
Output:
156.48.459.20 - - [11/Aug/2019
66.23.114.251 - - [11/Aug/2019
235.145.41.12 - - [11/Aug/2019
A more readable view:
(?m)
^
(?:
     ( [\d.]* [- ]* \[ 11/Aug/2019 .* )    # (1)
     \R*
     (?= [\S\s]*? \1 )
  |
     (?! .* \[ 11/Aug/2019 )
     .* \R*
)
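The expanded layout can also be run directly by adding PCRE's x (extended) modifier, which ignores whitespace outside character classes and treats # as the start of a comment. The sketch below is the same pattern with explanatory comments added. The comments are my reading of the pattern, and $pattern and $kept are names invented for the example.

```php
<?php
$target = <<<'EOS'
156.48.459.20 - - [11/Aug/2019
156.48.459.20 - - [11/Aug/2019
235.145.41.12 - - [11/Aug/2019
235.145.41.12 - - [11/Aug/2019
66.23.114.251 - - [11/Aug/2019
66.23.114.251 - - [09/Aug/2019
156.48.459.20 - - [11/Aug/2019
235.145.41.12 - - [11/Aug/2019
66.23.114.251 - - [01/Aug/2019
66.23.114.251 - - [11/Aug/2019
235.145.41.12 - - [11/Aug/2019
EOS;

// Same pattern as above, with the m and x modifiers moved to the end.
$pattern = '~
    ^
    (?:
        ( [\d.]* [- ]* \[ 11/Aug/2019 .* )   # (1) a line for the target date
        \R*                                  # plus the line break after it
        (?= [\S\s]*? \1 )                    # only when the same text occurs again later
      |
        (?! .* \[ 11/Aug/2019 )              # or a line for any other date
        .* \R*
    )
~mx';

// Everything the pattern matches is deleted, so only the last copy of each line stays.
$kept = preg_replace($pattern, '', $target);

echo $kept . "\n";
```

Because a line is removed only when its text appears again further down, the surviving copy is the last occurrence, which is why the output order differs from the order of first appearance.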
Answer 2 (score: 0)
$txt = <<<'EOD'
156.48.459.20 - - [11/Aug/2019
156.48.459.20 - - [11/Aug/2019
235.145.41.12 - - [11/Aug/2019
235.145.41.12 - - [11/Aug/2019
66.23.114.251 - - [11/Aug/2019
EOD;
$url = 'data:text/plain;base64,' . base64_encode($txt);
// change this line with the url of your log file: $url = '/path/to/file.log';
$result = [];
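if ( false !== $handle = fopen($url, 'r') ) {
    // split each line on spaces: field 0 is the IP, field 3 is the date
    while ( false !== $data = fgetcsv($handle, 1000, ' ') ) {
        if ( isset($data[3]) && $data[3] === '[11/Aug/2019' ) {
            // using the IP as the key keeps each IP only once
            $result[$data[0]] = 1;
        }
    }
    fclose($handle);
}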
$result = array_keys($result);
print_r($result);
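The code above returns only the IP addresses. If the full lines are wanted, as in the question's expected result, the same keyed-array idea can store the line itself. The following is a minimal sketch under that assumption. firstLinePerIp() is a hypothetical helper, not a PHP built-in, and it assumes the fourth space-separated field begins with "[" followed by the date.

```php
<?php
// Hypothetical helper: returns the first full line seen for each IP on $date.
function firstLinePerIp(string $log, string $date): array
{
    $seen = [];
    foreach (preg_split('/\R/', $log) as $line) {
        $fields = explode(' ', $line);
        // Field 0 is the IP, field 3 is the "[dd/Mon/yyyy" part of the entry.
        if (isset($fields[3])
            && strpos($fields[3], '[' . $date) === 0
            && !isset($seen[$fields[0]])) {
            $seen[$fields[0]] = $line;
        }
    }
    // Drop the IP keys and return a plain list of lines.
    return array_values($seen);
}

$txt = "156.48.459.20 - - [11/Aug/2019\n"
    . "156.48.459.20 - - [11/Aug/2019\n"
    . "235.145.41.12 - - [11/Aug/2019\n"
    . "235.145.41.12 - - [11/Aug/2019\n"
    . "66.23.114.251 - - [11/Aug/2019";

print_r(firstLinePerIp($txt, '11/Aug/2019'));
```

Note that this deduplicates by IP rather than by whole line, so two different requests from the same IP on that date collapse into one entry.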