I have a text file containing a large number of lines like these:
"4EF\"]\n,\"blue-apple\^&**%
"4EF\"]\n,\"orange\/^4^&**%
How can I extract the following values?
blue-apple
orange
As you can see, each value sits between 4EF\"]\n,\" and the next backslash (\).
Answer 0 (score: 1)
You can use preg_match_all() to extract the part of each line you need:
$str = '"4EF\"]\n,\"blue-apple\^&**%
"4EF\"]\n,\"orange\/^4^&**%';
preg_match_all('~^"4EF\\\\"[^"]+"([^\\\\]+)~m', $str, $matches);
print_r($matches[1]);
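After the leading "4EF\", the pattern uses [^"]+ to skip everything up to the next double quote, then the capture group keeps everything up to the next backslash.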
Alternatively, spell out the whole prefix explicitly:
$str = '"4EF\"]\n,\"blue-apple\^&**%
"4EF\"]\n,\"orange\/^4^&**%';
preg_match_all('~^"4EF\\\\"\]\\\\n,\\\\"([^\\\\]+)~m', $str, $matches);
print_r($matches[1]);
Output:
Array
(
    [0] => blue-apple
    [1] => orange
)
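Since the question mentions a large text file, it may be more practical to test one line at a time instead of pasting the data into a string. The following is only a minimal sketch under that assumption: extractValues() is a hypothetical helper name, data.txt is a placeholder path, and the pattern is the explicit-prefix one without the m flag, because each line is matched separately.

```php
<?php
// Hypothetical helper: collects the captured value from every matching line.
// Accepts any iterable of lines (an array, or an SplFileObject for a file).
function extractValues(iterable $lines): array
{
    // Same explicit-prefix pattern; no m flag needed for single lines.
    $pattern = '~^"4EF\\\\"\]\\\\n,\\\\"([^\\\\]+)~';
    $values = [];
    foreach ($lines as $line) {
        if (preg_match($pattern, $line, $m) === 1) {
            $values[] = $m[1];
        }
    }
    return $values;
}

// Sample lines taken from the question.
$sample = [
    '"4EF\"]\n,\"blue-apple\^&**%',
    '"4EF\"]\n,\"orange\/^4^&**%',
];
$result = extractValues($sample);
print_r($result);

// For a real file (placeholder path), something like:
// $result = extractValues(new SplFileObject('data.txt'));
```

Iterating over the file keeps memory use roughly constant, whereas loading everything into one string for preg_match_all() holds the whole file in memory.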
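Breakdown of the explicit-prefix pattern, as written inside the single-quoted PHP string (each \\\\ reaches the regex engine as \\, which matches one literal backslash):
~ # opening delimiter
^ # start of a line (because of the m flag)
"4EF # the literal sequence "4EF
\\\\ # a literal backslash
" # a double quote
\] # a literal ] (the escape is optional outside a character class)
\\\\ # a literal backslash
n, # the literal sequence n,
\\\\ # a literal backslash
" # a double quote
( # start of the capture group
[^\\\\]+ # one or more characters that are not a backslash
) # end of the capture group
~ # closing delimiter
m # multiline flag, so ^ matches at the start of every line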
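The quadruple backslashes exist only because the pattern sits inside a single-quoted PHP string. As a sketch of one way to make this easier to read, the same pattern can be written in a nowdoc (which does no backslash processing) with the x flag added, so the comments live inside the pattern itself. The nowdoc, the x flag, and the variable names are my additions here, not part of the patterns above.

```php
<?php
// Nowdoc: the text below is passed to PCRE exactly as written,
// so one literal backslash is matched by \\ instead of \\\\.
// With the x flag, whitespace and #-comments in the pattern are ignored.
$pattern = <<<'REGEX'
~
^"4EF      # literal "4EF at the start of a line
\\"        # a backslash, then a double quote
\]         # a literal ]
\\n,       # a backslash, then n and a comma
\\"        # a backslash, then a double quote
([^\\]+)   # capture everything up to the next backslash
~mx
REGEX;

$str = '"4EF\"]\n,\"blue-apple\^&**%
"4EF\"]\n,\"orange\/^4^&**%';

preg_match_all($pattern, $str, $matches);
print_r($matches[1]);
```

One caveat with this style: the delimiter character (~ here) must not appear inside the comments, because PHP looks for the closing delimiter before PCRE ever sees the pattern.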