I am trying to extract the numbers into an array. Here are my string and my code.
$split_times = "return escape('<table class=\'split\' ><tr><td class=\'split0\'>50m</td><td class=\'split1\'>28.86</td><td class=\'split2\'>28.86</td></tr><tr><td class=\'split0\'>100m</td><td class=\'split1\'>1:01.56</td><td class=\'split2\'>32.70</td></tr><tr><td class=\'splitsep\' colspan=\'3\'></td></tr><tr><td class=\'split0\'>150m</td><td class=\'split1\'>1:36.88</td><td class=\'split2\'>35.32</td></tr><tr><td class=\'split0\'>200m</td><td class=\'split1\'>2:59:09.93</td><td class=\'split2\'>33.05</td></tr></table>')";
preg_match_all("/split1\\\'>(\d+(?:\.\d+)?)</", $split_times, $split_times_distances);
print_r($split_times_distances);
It should return an array like this:
Array
(
[0] => Array
(
[0] => split1\'>28.86<
[1] => split1\'>1:01.56<
[2] => split1\'>1:36.88<
[3] => split1\'>2:59:09.93<
)
[1] => Array
(
[0] => 28.86
[1] => 1:01.56
[2] => 1:36.88
[3] => 2:59:09.93
)
)
Instead, both arrays only contain the first entry (28.86).
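Answer 0 (score: 1)
Your regex doesn't match =\'split1\'>1:36.88< because \d+(?:\.\d+)? does not allow for the colon-separated hours and minutes in front of the seconds.
You have to add (?:\d+:){0,2} at the beginning of the capture group.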
$split_times = "return escape('<table class=\'split\' ><tr><td class=\'split0\'>50m</td><td class=\'split1\'>28.86</td><td class=\'split2\'>28.86</td></tr><tr><td class=\'split0\'>100m</td><td class=\'split1\'>1:01.56</td><td class=\'split2\'>32.70</td></tr><tr><td class=\'splitsep\' colspan=\'3\'></td></tr><tr><td class=\'split0\'>150m</td><td class=\'split1\'>1:36.88</td><td class=\'split2\'>35.32</td></tr><tr><td class=\'split0\'>200m</td><td class=\'split1\'>2:59:09.93</td><td class=\'split2\'>33.05</td></tr></table>')";
preg_match_all("/split1\\\'>((?:\d+:){0,2}\d+(?:\.\d+)?)</", $split_times, $split_times_distances);
// added part: (?:\d+:){0,2} at the start of the capture group
print_r($split_times_distances);
Output:
Array
(
[0] => Array
(
[0] => split1\'>28.86<
[1] => split1\'>1:01.56<
[2] => split1\'>1:36.88<
[3] => split1\'>2:59:09.93<
)
[1] => Array
(
[0] => 28.86
[1] => 1:01.56
[2] => 1:36.88
[3] => 2:59:09.93
)
)
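To see why the original pattern returns only the first value, it can help to run both patterns against the bare cell values. This is a minimal sketch: the $cells array and the variable names are made up for illustration, and the patterns are anchored with ^ and $ so they test a whole value rather than searching inside the HTML.

```php
<?php
// Hypothetical sample: the four split1 cell values from the question.
$cells = ['28.86', '1:01.56', '1:36.88', '2:59:09.93'];

// Original pattern: digits with an optional decimal part only.
$original = '/^\d+(?:\.\d+)?$/';
// Pattern from this answer: up to two "digits:" groups are allowed first.
$fixed = '/^(?:\d+:){0,2}\d+(?:\.\d+)?$/';

// preg_grep keeps the original keys, so reindex for a clean comparison.
$matchedOriginal = array_values(preg_grep($original, $cells));
$matchedFixed    = array_values(preg_grep($fixed, $cells));

print_r($matchedOriginal); // only the value without a colon
print_r($matchedFixed);    // all four values
```

The {0,2} quantifier covers seconds only, minutes:seconds, and hours:minutes:seconds, which are the three shapes that appear in the sample data.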
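Answer 1 (score: 1)
You have already extracted the string from the onMouse... attribute with DOMDocument, so why not continue the same way?
Extracting a JavaScript string is easy even without a dedicated JavaScript parser. After that, all you have to do is remove the escaped quotes to get the "raw" string: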
$onMouseAttr = "return escape('<table class=\'split\' ><tr><td class=\'split0\'>50m</td><td class=\'split1\'>28.86</td><td class=\'split2\'>28.86</td></tr><tr><td class=\'split0\'>100m</td><td class=\'split1\'>1:01.56</td><td class=\'split2\'>32.70</td></tr><tr><td class=\'splitsep\' colspan=\'3\'></td></tr><tr><td class=\'split0\'>150m</td><td class=\'split1\'>1:36.88</td><td class=\'split2\'>35.32</td></tr><tr><td class=\'split0\'>200m</td><td class=\'split1\'>2:59:09.93</td><td class=\'split2\'>33.05</td></tr></table>')";
# first step: extracting the strings
$stringPattern = <<<'EOD'
~ " ( [^"\\]* (?:\\.[^"\\]*)* ) " | ' ( [^'\\]* (?:\\.[^'\\]*)* ) ' ~xsS
EOD;
if ( preg_match_all($stringPattern, $onMouseAttr, $matches, PREG_SET_ORDER) ) {
foreach ($matches as $match) {
# unescape the string for the correct quote
$html = isset($match[2]) ? str_replace("\\'", "'", $match[2])
: str_replace('\\"', '"', $match[1]);
# extract the nodes you want with DOMDocument/DOMXPath
$dom = new DOMDocument;
$dom->loadHTML($html);
$xp = new DOMXPath($dom);
$nodeList = $xp->query('//td[@class="split1"]');
foreach ($nodeList as $node) {
# display them
echo $node->nodeValue, PHP_EOL;
# or store them
# $results[] = $node->nodeValue;
}
}
}
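The unescaping step can be checked on its own before involving DOMDocument. The sketch below assumes a shortened, hypothetical attribute value ($attr) that uses the same escaping style as the real one, and it only handles the single-quoted branch of the pattern above.

```php
<?php
// Hypothetical shortened attribute value; in PHP source, \\' yields \' in the string.
$attr = "return escape('<td class=\\'split1\\'>28.86</td>')";

// Single-quoted JavaScript string: capture everything between the outer quotes,
// allowing backslash-escaped characters inside.
$pattern = <<<'EOD'
~'([^'\\]*(?:\\.[^'\\]*)*)'~s
EOD;

$raw = null;
if (preg_match($pattern, $attr, $m)) {
    // Turn \' back into ' to obtain the raw HTML fragment.
    $raw = str_replace("\\'", "'", $m[1]);
    echo $raw, PHP_EOL;
}
```

Once the fragment has plain quotes again, it is ordinary HTML and can be handed to loadHTML() as in the answer's code.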
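Answer 2 (score: 0)
What I did: simply select all the characters between two strings, in this case based on the <td> class name.
preg_match_all("/split1\\\'>(.*?)</", $split_times, $split_times_distances);
print_r($split_times_distances);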
Array
(
[0] => Array
(
[0] => split1\'>28.86<
[1] => split1\'>1:01.56<
[2] => split1\'>1:36.88<
[3] => split1\'>2:59:09.93<
)
[1] => Array
(
[0] => 28.86
[1] => 1:01.56
[2] => 1:36.88
[3] => 2:59:09.93
)
)
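One trade-off worth noting: (.*?) accepts any cell content, not just times. The sketch below compares it with the stricter pattern from Answer 0 on a hypothetical input ($html) that contains a non-numeric cell. The "DNF" value is invented for illustration and does not appear in the question's data.

```php
<?php
// Hypothetical input: one time value and one non-numeric cell.
$html = "<td class=\\'split1\\'>1:01.56</td><td class=\\'split1\\'>DNF</td>";

// Lazy "anything up to the next <" pattern from this answer.
preg_match_all("/split1\\\'>(.*?)</", $html, $lazy);
// Stricter time pattern from Answer 0.
preg_match_all("/split1\\\'>((?:\d+:){0,2}\d+(?:\.\d+)?)</", $html, $strict);

print_r($lazy[1]);   // both cells, including the non-numeric one
print_r($strict[1]); // only the cell that looks like a time
```

If every split1 cell is known to hold a time, the lazy pattern is simpler. If the cells may hold other text, the explicit pattern filters it out at match time.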