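I have a function that scrapes a table. The table's <td> tags carry many attribute values, and those values differ from cell to cell. So I need some kind of wildcard character combination that lets me grab the content no matter what those values are.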
function scrape_between($data, $start, $end){
$data = stristr($data, $start); // Stripping all data from before $start
$data = substr($data, strlen($start)); // Stripping $start
$stop = stripos($data, $end); // Getting the position of the $end of the data to scrape
$data = substr($data, 0, $stop); // Stripping all data from after and including the $end of the data to scrape
return $data; // Returning the scraped data from the function
}
$match = $this -> scrape_between($array, '<td (__MAYBE SOME CHARACTER TO GET EVERYTHING NO MATTER WHAT__) class="V1_c01">', "</td>");
Edit: I want to do this in a foreach, because the table has different IDs and I want to search across all of them.
foreach ($separate_results as $key => $separate_result) {
if ($separate_result != "") {
$table[$key][0]= $this -> scrape_between($separate_result, '<td id="indhold_0_indholdbredvenstre_0_integrationwrapper_1_ctl01_Program_ProgramNormal_Program1_c04_0" class="V1_c04">', "</td>");
}
}
Answer 0 (score: 1)
If your concern is the __SOMETHING HERE__ part, and the class name V1_c01 is fixed, then the following is my POC:
<?php
function scrape_between($data, $classname, $tagname){
// get anything between `<td` and `classname`, where `<td` must be the first occurrence to the left of `classname`
$openstart = stristr(strrev($data), strrev($classname));
$openstart = substr($openstart, strlen($classname));
// the string is reversed here, so search for the reversed tag opener and keep it in the slice
$openstart = substr($openstart, 0, stripos($openstart, strrev('<'.$tagname)) + strlen($tagname) + 1);
$openstart = strrev($openstart);
// get anything between `classname` and `>`, where `>` must be the first occurrence to the right of `classname`
$openend = stristr($data, $classname);
$openend = substr($openend, strlen($classname));
$openend = substr($openend, 0, stripos($openend, '>')+1);
$start = $openstart.$classname.$openend; // '<td __SOMETHING HERE__ class="' . 'V1_c01' . '">'
$end = "</".$tagname.">";
$data = stristr($data, $start); // Stripping all data from before $start
$data = substr($data, strlen($start)); // Stripping $start
$stop = stripos($data, $end); // Getting the position of the $end of the data to scrape
$data = substr($data, 0, $stop); // Stripping all data from after and including the $end of the data to scrape
return $data; // Returning the scraped data from the function
}
$array = '<table><tr><td> </td></tr><tr><td style="" id="td1"><table><tr><td style="" class="V1_c01" id="mytd">my td content</td></tr></table></td></tr><tr><td> </td></tr></table>';
$match = scrape_between($array, 'V1_c01', "td");
echo $match;
echo '<br />';
$array = '<table><tr><td> </td></tr><tr><td style="" id="td1"><table><tr><td><span style="" class="V1_c01" id="myspan">my span content</span></td></tr></table></td></tr><tr><td> </td></tr></table>';
$match = scrape_between($array, 'V1_c01', "span");
echo $match;
?>
Result 1:
my td content
Result 2:
my span content
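For comparison, the "wildcard" the question asks for can also be expressed with a regular expression instead of string slicing. The following is only a minimal sketch, not the POC above: the function name scrape_by_class is hypothetical, and it assumes the class attribute holds exactly one class name in double quotes and that the element does not contain a nested element of the same tag.

```php
<?php
// Hypothetical helper (an assumption, not part of the original answer):
// returns the inner content of the first <$tagname> element whose class
// attribute equals $classname, whatever other attributes surround it.
function scrape_by_class(string $data, string $classname, string $tagname): ?string
{
    $tag = preg_quote($tagname, '~');
    $pattern = '~<' . $tag . '\b'                                   // opening tag name
             . '[^>]*\bclass="' . preg_quote($classname, '~') . '"' // any attributes, then the class
             . '[^>]*>'                                             // rest of the opening tag
             . '(.*?)'                                              // content, captured lazily
             . '</' . $tag . '>~is';                                // first matching closing tag

    // preg_match returns 1 on a match; $m[1] holds the captured content
    return preg_match($pattern, $data, $m) === 1 ? $m[1] : null;
}

$html = '<table><tr><td style="" class="V1_c01" id="mytd">my td content</td></tr></table>';
echo scrape_by_class($html, 'V1_c01', 'td'); // my td content
```

The `[^>]*` pieces play the role of __SOMETHING HERE__: they accept any attributes before and after the class without leaving the opening tag. For markup that is nested or irregular, a real HTML parser is likely a safer choice than either approach.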