Question

Suppose I have a string variable:

$str = '
[WhiteTitle "GM"]
[WhiteCountry "Cuba"]
[BlackCountry "United States"]

1. d4 d5 2. Nf3 Nf6 3. e3 c6 4. c4 e6 5. Nc3 Nbd7 6. Bd3 Bd6
7. O-O O-O 8. e4 dxe4 9. Nxe4 Nxe4 10. Bxe4 Nf6 11. Bc2 h6
12. b3 b6 13. Bb2 Bb7 14. Qd3 g6 15. Rae1 Nh5 16. Bc1 Kg7
17. Rxe6 Nf6 18. Ne5 c5 19. Bxh6+ Kxh6 20. Nxf7+ 1-0
';

I want to extract some information from that variable, like this:

Array {
    ["WhiteTitle"] => "GM",
    ["WhiteCountry"] => "Cuba",
    ["BlackCountry"] => "United States"
}

Thanks.

Answer 1

You can use:

preg_match_all('/\[(.*?) "(.*?)"\]/m', $str, $matches, PREG_SET_ORDER);
print_r($matches);

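This gives you every match in an array. In each match, key 0 holds the full match, key 1 the first captured part (the tag name), and key 2 the second captured part (the value):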

Output:

Array
(
    [0] => Array
        (
            [0] => [WhiteTitle "GM"]
            [1] => WhiteTitle
            [2] => GM
        )

    [1] => Array
        (
            [0] => [WhiteCountry "Cuba"]
            [1] => WhiteCountry
            [2] => Cuba
        )

    [2] => Array
        (
            [0] => [BlackCountry "United States"]
            [1] => BlackCountry
            [2] => United States
        )
)

If you want it in the format you asked for, you can use a simple loop:

$array = array();
foreach($matches as $match){
    $array[$match[1]] = $match[2];
}
print_r($array);

Output:

Array
(
    [WhiteTitle] => GM
    [WhiteCountry] => Cuba
    [BlackCountry] => United States
)
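As an alternative to the loop, here is a minimal sketch that builds the same map with array_column. It assumes PHP 5.5 or later (where array_column is available) and uses only the tag lines from the question as sample input.

```php
<?php
// Sample input: the tag lines from the question.
$str = '[WhiteTitle "GM"]
[WhiteCountry "Cuba"]
[BlackCountry "United States"]';

preg_match_all('/\[(.*?) "(.*?)"\]/', $str, $matches, PREG_SET_ORDER);

// Column 2 (the value) becomes the array value,
// column 1 (the tag name) becomes the array key.
$array = array_column($matches, 2, 1);
print_r($array);
```

This works because each PREG_SET_ORDER entry is itself an array indexed 0, 1, 2, so the capture groups can be addressed as columns.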

Answer 2

You can use something like this:

<?php
$string = <<< EOF
[WhiteTitle "GM"]
[WhiteCountry "Cuba"]
[BlackCountry "United States"]
1. d4 d5 2. Nf3 Nf6 3. e3 c6 4. c4 e6 5. Nc3 Nbd7 6. Bd3 Bd6
7. O-O O-O 8. e4 dxe4 9. Nxe4 Nxe4 10. Bxe4 Nf6 11. Bc2 h6
12. b3 b6 13. Bb2 Bb7 14. Qd3 g6 15. Rae1 Nh5 16. Bc1 Kg7
17. Rxe6 Nf6 18. Ne5 c5 19. Bxh6+ Kxh6 20. Nxf7+ 1-0
EOF;

$final = array();
preg_match_all('/\[(.*?)\s+(".*?")\]/', $string, $matches, PREG_PATTERN_ORDER);
for($i = 0; $i < count($matches[1]); $i++) {
    $final[$matches[1][$i]] = $matches[2][$i];
}

print_r($final);

Output:

Array
(
    [WhiteTitle] => "GM"
    [WhiteCountry] => "Cuba"
    [BlackCountry] => "United States"
)

Ideone demo:

http://ideone.com/wQYshT

Regex explanation:

\[(.*?)\s+(".*?")\]

Match the character “[” literally «\[»
Match the regex below and capture its match into backreference number 1 «(.*?)»
   Match any single character that is NOT a line break character (line feed) «.*?»
      Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
Match a single character that is a “whitespace character” (space, tab, line feed, carriage return, vertical tab, form feed) «\s+»
   Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Match the regex below and capture its match into backreference number 2 «(".*?")»
   Match the character “"” literally «"»
   Match any single character that is NOT a line break character (line feed) «.*?»
      Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
   Match the character “"” literally «"»
Match the character “]” literally «\]»
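Note that capture group 2 above includes the double quotes, which is why they appear in the output values. If the unquoted values from the question are wanted, one possible variation (a sketch, not part of the original answer) is to move the quotes outside the group:

```php
<?php
// Sample input: the tag lines from the question.
$string = '[WhiteTitle "GM"]
[WhiteCountry "Cuba"]
[BlackCountry "United States"]';

// Same pattern as above, except the quotes sit outside capture group 2.
preg_match_all('/\[(.*?)\s+"(.*?)"\]/', $string, $matches, PREG_PATTERN_ORDER);

// With PREG_PATTERN_ORDER, $matches[1] holds all names and $matches[2] all values.
$final = array_combine($matches[1], $matches[2]);
print_r($final);
```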

Answer 3

Here is a safer, more compact solution:

$re = '~\[([^]["]*?)\s*"([^]"]+)~';   // Defining the regex
$str = "[WhiteTitle \"GM\"]\n[WhiteCountry \"Cuba\"]\n[BlackCountry \"United States\"]\n\n1. d4 d5 2. Nf3 Nf6 3. e3 c6 4. c4 e6 5. Nc3 Nbd7 6. Bd3 Bd6\n7. O-O O-O 8. e4 dxe4 9. Nxe4 Nxe4 10. Bxe4 Nf6 11. Bc2 h6\n12. b3 b6 13. Bb2 Bb7 14. Qd3 g6 15. Rae1 Nh5 16. Bc1 Kg7\n17. Rxe6 Nf6 18. Ne5 c5 19. Bxh6+ Kxh6 20. Nxf7+ 1-0"; 
preg_match_all($re, $str, $matches);  // Getting all matches
print_r(array_combine($matches[1],$matches[2])); // Creating the final array with array_combine

See the IDEONE PHP demo and the regex demo.

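Regex details:

\[ - an opening [
([^]["]*?) - Group 1: zero or more characters other than ", [ and ], as few as possible
\s* - zero or more whitespace characters (this trims the first value)
" - a double quote
([^]"]+) - Group 2: one or more characters other than ] and "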
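To illustrate what "safer" can mean here, the sketch below runs a dot-based pattern and the negated-class pattern on a made-up line where a stray bracketed token precedes the real tag. The input string and the variable names $naive and $strict are hypothetical and chosen only for this comparison.

```php
<?php
// Hypothetical input: a stray bracketed token before the real tag, on one line.
$input = '[Note] [Site "Havana"]';

// Dot-based pattern: group 1 is allowed to run across "]" and "[".
preg_match_all('/\[(.*?) "(.*?)"\]/', $input, $m);
$naive = array_combine($m[1], $m[2]);

// Negated character classes: group 1 cannot contain "[", "]" or a double quote.
preg_match_all('~\[([^]["]*?)\s*"([^]"]+)~', $input, $m);
$strict = array_combine($m[1], $m[2]);

var_dump($naive);   // the key is 'Note] [Site'
var_dump($strict);  // the key is 'Site'
```

The dot-based pattern starts at the first [ and keeps expanding until it finds a space followed by a quote, so the key swallows the stray token. The negated classes stop the match from crossing a bracket, so only the real tag is captured.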

Extracting text fragments with PHP and regex

3 answers: