Quick regular expression with grep / perl / php

Time: 2015-03-24 08:42:32

Tags: php regex perl

I have text like this:

2015-03-11 10:15 - anonymous logged in to [127.0.0.1] on account 1DEYKqtPtAEt5hDfiAlz7SdFEGUSPguxGu using key uembzzQdgFHq9k0UJfEi4Dnkvc7n3N5tWVNRQmKfZpeJPnyzKVzKSVVsvLGL6bY 0.2379845 BTC were transferred to address 1cRa5v0Nxu9ABkkzlTv4dzsyRf0hOkg27N using key 1zM9nBd1PNu1FF7qKr1t9Y4m0TawPa3ZQJ1LrlvtViCiB2aFjgn8BIHWG2VHjJvV

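I am trying to extract the pseudo-BTC accounts and their matching keys. The text is not as regular as an Apache log: one long line may contain 5 or 10 addresses, while another line may carry the same kind of information just once. How can I match each pseudo-BTC account together with its key, from both the "logged in" and the "transferred" parts, even when there are multiple occurrences on one line?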

I tried this:

function get_string_between($string, $start, $end){
    $string = " ".$string;
    $ini = strpos($string,$start);
    if ($ini == 0) return "";
    $ini += strlen($start);
    $len = strpos($string,$end,$ini) - $ini;
    return substr($string,$ini,$len);
}
$parsed = get_string_between($fullstring, "account", "using");

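It gets the address but not the matching key. I would like to parse the text with grep, perl, or php, whichever makes the job easier. I have been trying with PHP because it is the language I am at least familiar with.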

Any ideas? Thanks for your help.

1 answer:

Answer 0: (score: 0)

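You need a regex with capture groups, used with the preg_match_all function, which returns every occurrence in the string rather than only the first one.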

$str = "2015-03-11 10:15 - anonymous logged in to [127.0.0.1] on account 1DEYKqtPtAEt5hDfiAlz7SdFEGUSPguxGu using key uembzzQdgFHq9k0UJfEi4Dnkvc7n3N5tWVNRQmKfZpeJPnyzKVzKSVVsvLGL6bY 0.2379845 BTC were transferred to address 1cRa5v0Nxu9ABkkzlTv4dzsyRf0hOkg27N using key 1zM9nBd1PNu1FF7qKr1t9Y4m0TawPa3ZQJ1LrlvtViCiB2aFjgn8BIHWG2VHjJvV";
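// Group 1 captures the account/address, group 2 the key that follows "using key".
preg_match_all('~(?:account|address)\s+(\S+)\s+using\s+key\s+(\S+)~m', $str, $match);
print_r($match[1]); // all accounts/addresses
print_r($match[2]); // the corresponding keys, in the same order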

Output:

Array
(
    [0] => 1DEYKqtPtAEt5hDfiAlz7SdFEGUSPguxGu
    [1] => 1cRa5v0Nxu9ABkkzlTv4dzsyRf0hOkg27N
)
Array
(
    [0] => uembzzQdgFHq9k0UJfEi4Dnkvc7n3N5tWVNRQmKfZpeJPnyzKVzKSVVsvLGL6bY
    [1] => 1zM9nBd1PNu1FF7qKr1t9Y4m0TawPa3ZQJ1LrlvtViCiB2aFjgn8BIHWG2VHjJvV
)
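
If you would rather keep each account next to its key instead of reading two parallel arrays, one option is the PREG_SET_ORDER flag of preg_match_all, which groups the captures per match. The following is only a minimal sketch under that assumption: the helper name extract_account_key_pairs and the 'account' / 'key' array keys are hypothetical, chosen for illustration, and the regex is the same one as above without the m modifier (the pattern has no ^ or $ anchors, so the modifier should make no difference here).

```php
<?php
// Hypothetical helper (not part of the original answer): returns one
// ['account' => ..., 'key' => ...] entry per match, in order of appearance.
function extract_account_key_pairs(string $text): array
{
    $pattern = '~(?:account|address)\s+(\S+)\s+using\s+key\s+(\S+)~';

    // PREG_SET_ORDER makes $matches[i] hold all captures of the i-th match.
    preg_match_all($pattern, $text, $matches, PREG_SET_ORDER);

    $pairs = [];
    foreach ($matches as $m) {
        $pairs[] = ['account' => $m[1], 'key' => $m[2]];
    }
    return $pairs;
}

// Sample line taken from the question.
$str = '2015-03-11 10:15 - anonymous logged in to [127.0.0.1] on account 1DEYKqtPtAEt5hDfiAlz7SdFEGUSPguxGu using key uembzzQdgFHq9k0UJfEi4Dnkvc7n3N5tWVNRQmKfZpeJPnyzKVzKSVVsvLGL6bY 0.2379845 BTC were transferred to address 1cRa5v0Nxu9ABkkzlTv4dzsyRf0hOkg27N using key 1zM9nBd1PNu1FF7qKr1t9Y4m0TawPa3ZQJ1LrlvtViCiB2aFjgn8BIHWG2VHjJvV';

foreach (extract_account_key_pairs($str) as $pair) {
    echo $pair['account'], ' => ', $pair['key'], PHP_EOL;
}
```

Because the pattern does not depend on line boundaries, the same call should also work on a string that contains many log lines, or on each line of a file read in a loop.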