Question

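I have the following data in a file, and it repeats many times:

Date:21 Month:03 Year:2017 Amount:50 Category:Grocery Account:bank Note:expensive

Now I want to extract the value that follows "Amount:", i.e. "50".

I am using the following code in PHP: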

$result = preg_split("/Amount/", $contents);
$truncated = substr($printresult, 1, 2);
print_r($truncated);

The result I get is:

    Da50

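Can you help me figure out what exactly I am doing wrong in this code?

Thanks.

[Edit: $contents contains all of the string data]

Here is the whole code: http://paste.ideaslabs.com/show/hwj7IiPUcd and the contents of data.txt: http://paste.ideaslabs.com/show/5TxWH8MUX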

Answer 1

You can try this:

    $subject = "Date:21 Month:03 Year:2017 Amount:50 Category:Grocery Account:bank Note:expensive";

    $pattern = "/Account/";

    preg_match($pattern, $subject, $matches);
    print_r($matches);

Answer 2

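The "Da" comes from "Date" at the start of your string. You need to use preg_match or preg_match_all to pull out an exact match. preg_split splits the string on the term it finds, so index 0 holds the text before "Amount", which is the part you don't care about. Try: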

$arraynext = 'Date:21
Month:03
Year:2017
Amount:50
Category:Wow
Account:The
Note:This';
$endresult = preg_match("/\s*Amount:\s*(\d+)/", $arraynext, $match);
echo $match[1];

Regex demo: https://regex101.com/r/SA48sm/1/

PHP demo: https://3v4l.org/6jaCV
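
To see why the original attempt prints part of "Date", it may help to look at what preg_split actually returns. This is only a minimal sketch: it assumes $contents holds a single record like the sample in the question, and the names $parts, $fromSplit and $amount are made up for illustration.

```php
<?php
// Assumed sample record, copied from the question.
$contents = 'Date:21 Month:03 Year:2017 Amount:50 Category:Grocery Account:bank Note:expensive';

// preg_split cuts the string at every "Amount"; the delimiter itself is dropped.
$parts = preg_split('/Amount/', $contents);
// $parts[0] is everything before "Amount" (it starts with "Date:21 ...").
// $parts[1] is everything after it (it starts with ":50 ...").

// substr on the first piece can only ever return characters from "Date...".
$fromSplit = substr($parts[0], 0, 2);

// preg_match with a capture group returns the digits directly.
$amount = preg_match('/Amount:\s*(\d+)/', $contents, $match) ? $match[1] : null;

var_dump($parts[0], $parts[1], $fromSplit, $amount);
```

Here $fromSplit is "Da" and $amount is "50", which is why matching, not splitting, is the better fit for this task.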

Answer 3

If, as you say, you have many matches, then you need to select all of them:

preg_match_all('/(?<=Amount:)[\d]{0,}/', $contents, $result);
foreach($result as $res) {
    print_r($res);
}
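
With its default flags, preg_match_all fills the result with one subarray per group, so the full matches sit at index 0. The sketch below loops over that subarray to print one amount per line. It assumes a two-record sample in $contents (invented here for illustration) and uses \d+ so that at least one digit is required, whereas {0,} would also accept an empty match.

```php
<?php
// Assumed two-record sample; the real file contents are not shown in the question.
$contents = "Date:21 Month:03 Year:2017 Amount:50 Category:Grocery Account:bank Note:expensive\n"
          . "Date:1 Month:04 Year:2017 Amount:150 Category:Grocery Account:bank Note:expensive";

// $result[0] receives every full match of the pattern.
preg_match_all('/(?<=Amount:)\d+/', $contents, $result);

// Iterate over the matches themselves, not over the outer array.
foreach ($result[0] as $amount) {
    echo $amount, PHP_EOL;
}
```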

Answer 4

You can use the following regex pattern...

(?<=Amount:)\d+

See the regex demo.

PHP (demo)

$regex = '/(?<=Amount:)\d+/';
$arraynext = file_get_contents('data.txt');
preg_match_all($regex, $arraynext, $result);
print_r($result);

Answer 5

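Use this pattern: /Amount:\K\d+/
It will accurately extract the full desired number following each Amount: without using the much less efficient "lookarounds".

My web filtering software won't let me access your paste.ideaslabs.com links, so I can't see your actual input. (This is one of the many reasons why you should post your input sample directly in your question.) You state that you have several lines to extract from, so this is the sample input that I tested with: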

Date:21 Month:03 Year:2017 Amount:50 Category:Grocery Account:bank Note:expensive
Date:1 Month:04 Year:2017 Amount:150 Category:Grocery Account:bank Note:expensive
Date:14 Month:04 Year:2017 Amount:5 Category:Grocery Account:bank Note:expensive
Date:28 Month:04 Year:2017 Amount:5935 Category:Grocery Account:bank Note:expensive

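My pattern captures the desired results in just 48 steps. (Pattern Demo)
The pattern uses \K, which means "keep the characters from this point": everything matched before \K is dropped from the reported match, so no capture group is needed, and neither is a "lookbehind".
If your actual input data has an optional space between Amount: and the numeric value, just add " ?" (a space, then a question mark) to the pattern right after the colon, before the \K.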
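
As a rough sketch of the optional-space variant just described, the pattern becomes /Amount: ?\K\d+/. The three-line input below is invented for illustration and mixes both spellings.

```php
<?php
// Hypothetical input: the second line has a space after the colon.
$in = "Amount:50\nAmount: 150\nAmount:5";

// " ?" allows zero or one space; \K then discards "Amount:" and the space
// from the reported match, leaving only the digits.
preg_match_all('/Amount: ?\K\d+/', $in, $out);

var_export($out[0]);
```

Placing the optional space before \K matters: if it came after \K, the space would become part of each returned match.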

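When used with preg_match_all(), the output array is as small as preg_match_all() can make it: an array containing 1 subarray with 4 elements. I go straight to that subarray in my code:

Code: (Demo)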

$in='Date:21 Month:03 Year:2017 Amount:50 Category:Grocery Account:bank Note:expensive
Date:1 Month:04 Year:2017 Amount:150 Category:Grocery Account:bank Note:expensive
Date:14 Month:04 Year:2017 Amount:5 Category:Grocery Account:bank Note:expensive
Date:28 Month:04 Year:2017 Amount:5935 Category:Grocery Account:bank Note:expensive';

var_export(preg_match_all('/Amount:\K\d+/',$in,$out)?$out[0]:[]);

Output:

array (
  0 => '50',
  1 => '150',
  2 => '5',
  3 => '5935',
)

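As for the other answers on this page, they all process my test data in 600 steps (roughly 12 times slower / less efficient than my pattern). At the time of this post, one of them is completely wrong, and some use sloppy regex syntax that should not be learned from.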
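
For comparison, the three styles that appear on this page (capture group, lookbehind, and \K) can be run side by side on the sample input. This sketch only checks that they return the same values; it makes no claim about step counts, and the variable names are illustrative.

```php
<?php
$in = 'Date:21 Month:03 Year:2017 Amount:50 Category:Grocery Account:bank Note:expensive
Date:1 Month:04 Year:2017 Amount:150 Category:Grocery Account:bank Note:expensive
Date:14 Month:04 Year:2017 Amount:5 Category:Grocery Account:bank Note:expensive
Date:28 Month:04 Year:2017 Amount:5935 Category:Grocery Account:bank Note:expensive';

// Capture group: the digits land in index 1, the full matches in index 0.
preg_match_all('/Amount:(\d+)/', $in, $byGroup);

// Lookbehind: the digits are the full match (index 0).
preg_match_all('/(?<=Amount:)\d+/', $in, $byLookbehind);

// \K: "Amount:" is matched but dropped, so the digits are again index 0.
preg_match_all('/Amount:\K\d+/', $in, $byKeep);

// All three approaches should yield the same list of amounts.
var_export($byGroup[1] === $byLookbehind[0] && $byLookbehind[0] === $byKeep[0]);
```

The practical difference is the shape of the result: the capture-group version returns two subarrays, while the other two return just one.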
