How do I extract these characters from a text file?

Time: 2018-04-08 20:33:28

Tags: php preg-match preg-match-all

I have a large amount of code in a text file, for example:

"4EF\"]\n,\"blue-apple\^&**%
"4EF\"]\n,\"orange\/^4^&**%

How can I extract the following data:

blue-apple
orange

As you can see, the data sits between 4EF\"]\n,\" and the next backslash (\).

1 answer:

Answer 0: (score: 1)

You can use preg_match_all() to get the part of the string you need:

$str = '"4EF\"]\n,\"blue-apple\^&**%
"4EF\"]\n,\"orange\/^4^&**%';

// preg_match_all() returns the match count, so do not assign it back to $str
preg_match_all('~^"4EF\\\\"[^"]+"([^\\\\]+)~m', $str, $matches);
print_r($matches[1]);

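After the leading "4EF\", the regular expression skips everything up to and including the next double quote, then uses a capture group to keep everything up to the next backslash.

Alternatively: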

$str = '"4EF\"]\n,\"blue-apple\^&**%
"4EF\"]\n,\"orange\/^4^&**%';
// preg_match_all() returns the match count, so do not assign it back to $str
preg_match_all('~^"4EF\\\\"\]\\\\n,\\\\"([^\\\\]+)~m', $str, $matches);
print_r($matches[1]);

Output:

Array
(
    [0] => blue-apple
    [1] => orange
)
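
Since the question is about a text file rather than a hard-coded string, the same pattern can be applied to the file's contents. The following is a minimal sketch and not part of the original answer: the temporary file and the variable names ($path, $contents, $names) are assumptions made for illustration, and it assumes the file is small enough to load in one go with file_get_contents().

```php
<?php
// Hypothetical setup: write the two sample lines to a temporary file so the
// sketch is self-contained. In practice, point $path at your own file.
$sample = '"4EF\"]\n,\"blue-apple\^&**%' . "\n"
        . '"4EF\"]\n,\"orange\/^4^&**%' . "\n";
$path = tempnam(sys_get_temp_dir(), 'fruit');
file_put_contents($path, $sample);

// Read the whole file into a string.
$contents = file_get_contents($path);
if ($contents === false) {
    throw new RuntimeException("Could not read $path");
}

// preg_match_all() returns the number of matches (or false on error),
// so its result is kept separate from the input string.
$count = preg_match_all('~^"4EF\\\\"\]\\\\n,\\\\"([^\\\\]+)~m', $contents, $matches);

// $matches[1] holds the text captured by the first group on each line.
$names = $count ? $matches[1] : [];
print_r($names);

// Clean up the temporary file created for this sketch.
unlink($path);
```

For very large files, reading line by line and calling preg_match() on each line may be a better fit than loading everything at once.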

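Breakdown of the second regular expression:

~          # opening delimiter
^          # anchor the match at the start of a line
"4EF       # the literal sequence "4EF
\\\\       # a literal backslash (\\\\ in the single-quoted PHP string is \\ to the regex engine)
"          # a double quote
\]         # a literal ']' (the escape is optional outside a character class)
\\\\       # a literal backslash
n,         # the literal sequence n,
\\\\       # a literal backslash
"          # a double quote
(          # start of the capture group
  [^\\\\]+ # one or more characters that are not a backslash
)          # end of the capture group
~          # closing delimiter
m          # multiline flag, so ^ matches at the start of every line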
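
The annotated breakdown above can also live inside the pattern itself. The sketch below is an addition, not part of the original answer. It assumes PCRE's x (extended) flag, which ignores whitespace outside character classes and treats # as the start of a comment, and it puts the pattern in a nowdoc so that no PHP-level escaping applies: each \\ is written exactly as the regex engine sees it. The variable name $pattern is chosen for illustration.

```php
<?php
$str = '"4EF\"]\n,\"blue-apple\^&**%' . "\n"
     . '"4EF\"]\n,\"orange\/^4^&**%';

// Nowdoc: backslashes reach PCRE exactly as typed, so a literal backslash
// is written \\ here instead of \\\\ as in a single-quoted string.
// The comments inside the pattern must not contain the delimiter character.
$pattern = <<<'REGEX'
~
^            # start of a line (requires the m flag)
"4EF         # the literal sequence "4EF
\\           # a literal backslash
"            # a double quote
\]           # a literal closing bracket
\\           # a literal backslash
n,           # the literal sequence n,
\\           # a literal backslash
"            # a double quote
( [^\\]+ )   # capture one or more characters that are not a backslash
~mx
REGEX;

preg_match_all($pattern, $str, $matches);
print_r($matches[1]);

// The escaping layers in the single-quoted version: four backslashes in the
// PHP source produce a two-character string, which PCRE reads as one
// literal backslash.
var_dump(strlen('\\\\')); // int(2)
```

Keeping the comments in the pattern means the explanation cannot drift away from the code, at the cost of a longer pattern string.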