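I have an HTML file that I need to extract some data from. I am using regular expressions, which seemed simple enough. There are two kinds of data I need: dates and transactions. I want to print all the transactions for a given date, but neither the dates nor the transactions are numbered, so I don't know how to iterate over both.
Honestly, I have spent hours scratching my head and still can't work it out.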
$balpage=curl_exec($ch);
for($i=0;$i<count( );$i++){
    if(preg_match('#<\s*?strong\b[^>]*>(.*?)</strong\b[^>]*>#s',$balpage)==1){
        preg_match('#<\s*?strong\b[^>]*>(.*?)</strong\b[^>]*>#s',$balpage,$date);
        preg_match('#\<span class=\"issecureoff\"\>(.+?)\<\/span\>#s',$balpage,$transactions);
        print_r($date[1][$i]);
        print_r($transactions[1][$i]);
    }
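The code above is very incomplete and doesn't even run, but I just wanted to show roughly what it should look like. Sorry, I'm really new to this, so if any of you coding gurus could help, that would be great.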
Answer 0 (score: 4)
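My guess is that you probably just want to use preg_match_all, though we can also modify the expression slightly so that a single pattern matches both the dates and the amounts: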
<\s*strong[^>]*>(.*?)<\/strong[^>]*>|<span class="issecureoff">(.+?)<\/span>
$re = '/<\s*strong[^>]*>(.*?)<\/strong[^>]*>|<span class="issecureoff">(.+?)<\/span>/m';
$str = '<i class="uk-icon-calendar"></i><strong>2019.06.04</strong></td>
</tr>
<tr>
<td>
09:35
</td>
<td>
орлого
</td>
<td class="text-16 uk-text-nowrap">
<span class="issecureoff">0.00</span>
<span class="issecureon">*</span>
</td>
<td class="text-green uk-text-nowrap">
<span class="issecureoff">5,000.00</span>
<span class="issecureon">*</span>
<img src="Content/img/arrow_up.png" width="8"></td>
<td class="text-16 uk-text-nowrap">
<span class="issecureoff">5,000.00</span>
<span class="issecureon">*</span>
</td>
<td class="text-16 uk-text-nowrap uk-text-right"> </td>
</tr>
<tr>
<td>
09:35
</td>
<td>
Ухаалаг мэдээ үйлчилгээний хураамж
</td>
<td class="text-16 uk-text-nowrap">
<span class="issecureoff">5,000.00</span>
<span class="issecureon">*</span>
</td>
<td class="text-red uk-text-nowrap">
<span class="issecureoff">-50.00</span>
<span class="issecureon">*</span>
<img src="Content/img/arrown_down.png" width="8"></td>
<td class="text-16 uk-text-nowrap">
<span class="issecureoff">4,950.00</span>
<span class="issecureon">*</span>
</td>
<td class="text-16 uk-text-nowrap uk-text-right"> </td>
</tr>
<tr>
<td colspan="6" class="text-12 letter-space-1"><i class="uk-icon-calendar"></i><strong>2019.06.14</strong></td>
</tr>
<tr>
<td>
11:00
</td>
<td>
batidert
</td>
<td class="text-16 uk-text-nowrap">
<span class="issecureoff">4,950.00</span>
<span class="issecureon">*</span>
</td>
<td class="text-green uk-text-nowrap">
<span class="issecureoff">50,000.00</span>
<span class="issecureon">*</span>
<img src="Content/img/arrow_up.png" width="8"></td>
<td class="text-16 uk-text-nowrap">
<span class="issecureoff">54,950.00</span>
<span class="issecureon">*</span>
</td>
<td class="text-16 uk-text-nowrap uk-text-right"> 5028604392</td>
</tr>
';
preg_match_all($re, $str, $matches, PREG_SET_ORDER, 0);
// Print the entire match result
var_dump($matches);
array(11) {
[0]=>
array(2) {
[0]=>
string(27) "<strong>2019.06.04</strong>"
[1]=>
string(10) "2019.06.04"
}
[1]=>
array(3) {
[0]=>
string(37) "<span class="issecureoff">0.00</span>"
[1]=>
string(0) ""
[2]=>
string(4) "0.00"
}
[2]=>
array(3) {
[0]=>
string(41) "<span class="issecureoff">5,000.00</span>"
[1]=>
string(0) ""
[2]=>
string(8) "5,000.00"
}
[3]=>
array(3) {
[0]=>
string(41) "<span class="issecureoff">5,000.00</span>"
[1]=>
string(0) ""
[2]=>
string(8) "5,000.00"
}
[4]=>
array(3) {
[0]=>
string(41) "<span class="issecureoff">5,000.00</span>"
[1]=>
string(0) ""
[2]=>
string(8) "5,000.00"
}
[5]=>
array(3) {
[0]=>
string(39) "<span class="issecureoff">-50.00</span>"
[1]=>
string(0) ""
[2]=>
string(6) "-50.00"
}
[6]=>
array(3) {
[0]=>
string(41) "<span class="issecureoff">4,950.00</span>"
[1]=>
string(0) ""
[2]=>
string(8) "4,950.00"
}
[7]=>
array(2) {
[0]=>
string(27) "<strong>2019.06.14</strong>"
[1]=>
string(10) "2019.06.14"
}
[8]=>
array(3) {
[0]=>
string(41) "<span class="issecureoff">4,950.00</span>"
[1]=>
string(0) ""
[2]=>
string(8) "4,950.00"
}
[9]=>
array(3) {
[0]=>
string(42) "<span class="issecureoff">50,000.00</span>"
[1]=>
string(0) ""
[2]=>
string(9) "50,000.00"
}
[10]=>
array(3) {
[0]=>
string(42) "<span class="issecureoff">54,950.00</span>"
[1]=>
string(0) ""
[2]=>
string(9) "54,950.00"
}
}
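
The dump above is a flat list in document order, so the dates still have to be paired with the amounts that follow them. One possible way to do that, as a minimal sketch: walk the matches once and attach every amount to the most recent date seen. The function name groupAmountsByDate is hypothetical and not part of the original answer. The sketch relies on the behaviour visible in the dump, where a date match has only indexes 0 and 1 while a span match also has index 2.

```php
<?php
// Hypothetical helper (an illustration, not the original answer's code):
// group every "issecureoff" amount under the date that precedes it.
function groupAmountsByDate(string $html): array
{
    $re = '/<\s*strong[^>]*>(.*?)<\/strong[^>]*>|<span class="issecureoff">(.+?)<\/span>/s';
    preg_match_all($re, $html, $matches, PREG_SET_ORDER);

    $byDate = [];
    $currentDate = null;
    foreach ($matches as $m) {
        if (isset($m[2])) {
            // Index 2 exists only when the <span> alternative matched.
            if ($currentDate !== null) {
                $byDate[$currentDate][] = $m[2];
            }
        } else {
            // Otherwise the <strong> alternative matched: a new date begins.
            $currentDate = $m[1];
            $byDate[$currentDate] = $byDate[$currentDate] ?? [];
        }
    }
    return $byDate;
}

// Usage with a shortened sample in the same shape as the page.
$sample = '<strong>2019.06.04</strong>'
    . '<span class="issecureoff">0.00</span>'
    . '<span class="issecureoff">5,000.00</span>'
    . '<strong>2019.06.14</strong>'
    . '<span class="issecureoff">4,950.00</span>';

foreach (groupAmountsByDate($sample) as $date => $amounts) {
    echo $date, ': ', implode(' | ', $amounts), PHP_EOL;
}
```

In the sample page each table row appears to carry three such amounts, so splitting each date's list with array_chunk($amounts, 3) might recover individual transactions. That is an assumption drawn from the sample markup only and should be checked against the real page.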