preg_replace across multiple lines in PHP

Asked: 2015-12-14 17:19:36

Tags: php preg-replace html-parsing

Code:

It works, but only when the post contains a single line. I want it to handle multiple lines:

Example: Now i edit this post

$message = preg_replace("/<div style='background-color:#C0C8D0;width:95%;'>SMA Forr&aacute;sk&oacute;d: <a href='' onclick='selectcode\\((.*)\\);return false;'>\\[ Mindet kijelol \\]<\\/a><\\/div><div id='(.*)' style=\"width:95%;max-width:95%;max-height: 500px; overflow:scroll;background-color: #FFFFFF;\"><pre class=\"sma\" style=\"font-family:monospace;font-size: 12px;\"><ol><li style=\"font-weight: normal; vertical-align:top;\"><div style=\"font: normal normal 1em\\/1\\.2em monospace; margin:0; padding:0; background:none; vertical-align:top;\">(.*)<\\/div><\\/li><\\/ol><\\/pre><\\/div>/", '[sma]<pre>$3</pre>[/sma]',$message);

That is fine, but I have more lines. For example:

->

[sma]Now i edit this post[/sma]

This outputs:

line1
line2
line3
line4
line5
line6

I want:

[sma]line1line2line3line4line5line6[/sma]

Multi-line HTML output:

[sma]line1

line2

line3

line4

line5

line6
[/sma]

1 answer:

Answer 0 (score: 0)

I assume you are either trying to learn regex parsing in PHP or trying to parse HTML to build something from it. I did this once to write an XML generator named hFeeds (check the Development branch for my latest commits). You may want to look at its code if you are trying to achieve the same thing. [Note: I stopped working on it a long time ago, because I developed a better one using the Laravel framework, named haaput, which currently powers my website MonitorKashmir.com.]

As the comments above suggested, parsing HTML with regular expressions is rarely recommended. In most cases you should use an HTML/XML parser instead, e.g. SimpleXML, which ships with PHP.

Some suggestions:

  1. Read up on SimpleXML and its usage.
  2. Use Regex101.com to test regular expressions, and its Code Generator to generate PHP code (at least for the time being).
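As a rough sketch of the parser-based route suggested above: the snippet below uses PHP's DOMDocument and DOMXPath rather than SimpleXML, on the assumption that the real markup is HTML (with entities such as &aacute;) and not well-formed XML. The input string, the XPath expression and the variable names are hypothetical, shortened stand-ins for the markup in the question.

```php
<?php
// Hypothetical, shortened input modelled on the markup in the question.
$html = "<ol>"
      . "<li><div>line1</div></li>"
      . "<li><div>line2</div></li>"
      . "<li><div>line3</div></li>"
      . "</ol>";

// Collect libxml warnings internally instead of printing them,
// since the input is a fragment and not a full HTML document.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

// Select every <div> that is a direct child of a list item.
$xpath = new DOMXPath($doc);
$lines = [];
foreach ($xpath->query("//li/div") as $div) {
    $lines[] = $div->textContent;
}

// Join the captured lines and wrap them in the [sma] tags from the question.
$result = "[sma]<pre>" . implode("\n", $lines) . "</pre>[/sma]";
echo $result, "\n";
```

Because the parser walks the document tree, this does not depend on the exact style attributes, which is the main reason a parser tends to be more robust than a long literal regex here.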

Anyway, if we analyse the problem above, we need a pattern that repeats itself. In this case, it is:

<div style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;">[CONTENT TO BE CAPTURED]</div>

So we only need to capture the [CONTENT TO BE CAPTURED] part. From the information provided here, I assume that [CONTENT TO BE CAPTURED] is purely alphanumeric, that is, it contains only letters and digits up to the next </div>.
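To make the alphanumeric assumption concrete, here is a small sketch comparing the [[:alnum:]]* class with a looser [^<]* class. The looser class is my own suggestion for content that contains spaces or punctuation, not part of the original answer, and the short class="x" div is a hypothetical stand-in for the long style attribute.

```php
<?php
// Hypothetical, shortened opening tag standing in for the styled <div>.
$open = '<div class="x">';

// preg_quote() escapes the literal tag so it is safe inside the pattern.
$strict = '/' . preg_quote($open, '/') . '([[:alnum:]]*)<\/div>/';
$loose  = '/' . preg_quote($open, '/') . '([^<]*)<\/div>/';

$html = $open . 'line1</div>' . $open . 'line 2, edited</div>';

preg_match_all($strict, $html, $strictMatches);
preg_match_all($loose, $html, $looseMatches);

// The strict class skips the second div, which contains a space and a comma.
print_r($strictMatches[1]);
// The loose class captures everything up to the next "<".
print_r($looseMatches[1]);
```

If the captured content can itself contain tags, neither class is enough, and a parser is the safer choice.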

So the solution to the problem is the following. Here $str can hold content fetched from some URL, e.g.

$str = file_get_contents("http://www.example.com/example.html");

and can be substituted into the code below accordingly.

$re = "/<div style=\"font: normal normal 1em\\/1\\.2em monospace; margin:0; padding:0; background:none; vertical-align:top;\">([[:alnum:]]*)<\\/div>/"; 
$str = "<div style='background-color:#C0C8D0;width:95%;'>SMA Forr&aacute;sk&oacute;d: <a href='' onclick='selectcode(93347);return false;'>[ Mindet kijelol ]</a></div><div id='93347' style=\"width:95%;max-width:95%;max-height: 500px; overflow:scroll;background-color: #FFFFFF;\"><pre class=\"sma\" style=\"font-family:monospace;font-size: 12px;\"><ol><li style=\"font-weight: normal; vertical-align:top;\"><div style=\"font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;\">line1</div></li><li style=\"font-weight: bold; vertical-align:top;\"><div style=\"font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;\">line2</div></li><li style=\"font-weight: normal; vertical-align:top;\"><div style=\"font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;\">line3</div></li><li style=\"font-weight: bold; vertical-align:top;\"><div style=\"font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;\">line4</div></li><li style=\"font-weight: normal; vertical-align:top;\"><div style=\"font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;\">line5</div></li><li style=\"font-weight: bold; vertical-align:top;\"><div style=\"font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;\">line6</div></li></ol></pre></div>\n"; 

preg_match_all($re, $str, $matches);
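After preg_match_all() runs, $matches[1] holds the text captured by the first group, one entry per matched <div>. One way to turn that into the [sma] output the question asks for might look like the sketch below. The shortened $str, the $div helper variable and the choice of a newline as separator are assumptions made for illustration.

```php
<?php
// The repeating opening tag from the question, kept in a helper variable.
$div = '<div style="font: normal normal 1em/1.2em monospace; margin:0; '
     . 'padding:0; background:none; vertical-align:top;">';

// Same idea as $re above: literal tag, alphanumeric capture, closing tag.
$re = '/' . preg_quote($div, '/') . '([[:alnum:]]*)<\/div>/';

// Hypothetical, shortened input with two lines instead of six.
$str = "<ol><li>{$div}line1</div></li><li>{$div}line2</div></li></ol>";

preg_match_all($re, $str, $matches);

// $matches[0] holds the full matches, $matches[1] the captured line texts.
$message = '[sma]<pre>' . implode("\n", $matches[1]) . '</pre>[/sma]';
echo $message, "\n";
```

Joining with an empty string instead of "\n" would give the single-line form, if that is the output actually wanted.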