Regular expressions are not my strong suit, and I'm having some trouble with this case.
I have the following string:
locale (district - town) [parish]
I need to extract the following information: 1 - locale, 2 - district, 3 - town.
I have these solutions:
1 - locale
preg_match("/([^(]*)\s/", $input_line, $output_array);
2 - district
preg_match("/.*\(([^-]*)\s/", $input_line, $output_array);
3 - town
preg_match("/.*\-\s([^)]*)/", $input_line, $output_array);
These seem to work fine. However, the string can also appear in any of the following forms:
localeA(localeB) (district - town) [parish]
locale (district - townA(townB)) [parish]
locale (district - townA-townB) [parish]
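The locale can contain its own parentheses. The town can contain parentheses and/or its own hyphens.
This makes it hard to extract the right information. In the three scenarios above, I would have to extract: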
localeA(localeB) + district + town
locale + district + townA(townB)
locale + district + townA-townB
I'm finding it hard to handle all of these cases. Can you help me out?
Thanks in advance.
Answer 0 (score: 0)
If the locale, district, and town contain no whitespace:
preg_match("/^\s*(\S+)\s*\((\S+)\s*-\s*(\S+)\)/", $input_line, $output_array);
Explanation:
The regular expression:
(?-imsx:^\s*(\S+)\s*\((\S+)\s*-\s*(\S+)\))
matches as follows:
NODE                     EXPLANATION
----------------------------------------------------------------------
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  ^                        the beginning of the string
----------------------------------------------------------------------
  \s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                           more times (matching the most amount
                           possible))
----------------------------------------------------------------------
  (                        group and capture to \1:
----------------------------------------------------------------------
    \S+                      non-whitespace (all but \n, \r, \t, \f,
                             and " ") (1 or more times (matching the
                             most amount possible))
----------------------------------------------------------------------
  )                        end of \1
----------------------------------------------------------------------
  \s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                           more times (matching the most amount
                           possible))
----------------------------------------------------------------------
  \(                       '('
----------------------------------------------------------------------
  (                        group and capture to \2:
----------------------------------------------------------------------
    \S+                      non-whitespace (all but \n, \r, \t, \f,
                             and " ") (1 or more times (matching the
                             most amount possible))
----------------------------------------------------------------------
  )                        end of \2
----------------------------------------------------------------------
  \s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                           more times (matching the most amount
                           possible))
----------------------------------------------------------------------
  -                        '-'
----------------------------------------------------------------------
  \s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                           more times (matching the most amount
                           possible))
----------------------------------------------------------------------
  (                        group and capture to \3:
----------------------------------------------------------------------
    \S+                      non-whitespace (all but \n, \r, \t, \f,
                             and " ") (1 or more times (matching the
                             most amount possible))
----------------------------------------------------------------------
  )                        end of \3
----------------------------------------------------------------------
  \)                       ')'
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------
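To see how this pattern behaves on the inputs from the question, here is a minimal sketch that runs it over all four sample strings. The names $pattern, $samples, and $results are only illustrative and do not come from the answer; capture groups 1 to 3 are read as locale, district, and town.

```php
<?php
// Pattern copied from the answer above. Single quotes keep the
// backslashes intact for the regex engine.
$pattern = '/^\s*(\S+)\s*\((\S+)\s*-\s*(\S+)\)/';

// Sample inputs taken from the question.
$samples = [
    'locale (district - town) [parish]',
    'localeA(localeB) (district - town) [parish]',
    'locale (district - townA(townB)) [parish]',
    'locale (district - townA-townB) [parish]',
];

$results = [];
foreach ($samples as $input_line) {
    if (preg_match($pattern, $input_line, $output_array) === 1) {
        // Index 0 is the whole match; 1..3 are locale, district, town.
        $results[] = array_slice($output_array, 1);
    } else {
        $results[] = null; // the line did not match
    }
}

print_r($results);
```

Because every field is matched with \S+, an input whose locale, district, or town contains a space (for example a two-word locale) should fail to match, which is the caveat stated at the top of this answer.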
Answer 1 (score: 0)
I'm not sure exactly what your rules and edge cases are, but this works for the examples provided:
preg_match('#^(.+?) \((.+?) - (.+?)\) \[(.+)\]$#',$str,$matches);
It gives these results (when run for each of the example strings in $str):
Array
(
    [0] => locale (district - town) [parish]
    [1] => locale
    [2] => district
    [3] => town
    [4] => parish
)
Array
(
    [0] => localeA(localeB) (district - town) [parish]
    [1] => localeA(localeB)
    [2] => district
    [3] => town
    [4] => parish
)
Array
(
    [0] => locale (district - townA(townB)) [parish]
    [1] => locale
    [2] => district
    [3] => townA(townB)
    [4] => parish
)
Array
(
    [0] => locale (district - townA-townB) [parish]
    [1] => locale
    [2] => district
    [3] => townA-townB
    [4] => parish
)
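If it helps, the same pattern can be wrapped in a small helper that returns the fields by name and signals a non-matching line. This is only a sketch: parse_place() is a hypothetical function name, the named groups are an addition to the pattern above, and the nullable return type assumes PHP 7.1 or later.

```php
<?php
// parse_place() is a hypothetical helper, not part of the answer above.
// It uses the answer's pattern with PCRE named groups added.
function parse_place(string $str): ?array
{
    $pattern = '#^(?<locale>.+?) \((?<district>.+?) - (?<town>.+?)\) \[(?<parish>.+)\]$#';

    if (preg_match($pattern, $str, $matches) !== 1) {
        return null; // input does not follow the expected layout
    }

    // Keep only the named fields, dropping the numeric duplicates.
    return [
        'locale'   => $matches['locale'],
        'district' => $matches['district'],
        'town'     => $matches['town'],
        'parish'   => $matches['parish'],
    ];
}

$parsed = parse_place('locale (district - townA(townB)) [parish]');
print_r($parsed);
```

The pattern relies on the literal separators " (", " - ", and ") [" being present exactly once in those roles, so a district that itself contains " - " would likely be split in the wrong place.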