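Given a large chunk of HTML that displays data nicely in <div> and <table> elements, how can all HTML/CSS markup be removed while keeping the text that was originally in separate cells and divs apart, now separated only by line breaks?
The current attempt shown here outputs one long continuous paragraph instead of keeping the separation that the divs and table provided.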
Original HTML: http://pastebin.com/63N3Kg16
Output
John Smith | SomeName Realty | (xxx) 939-4835 Allston St, Cambridge, MA Very spacious under renovation with SST/Granite, porch, minutes to MIT, redline, Nov/1 4BR/1BA Apartment $3,400/month Bedrooms 4 Bathrooms 1 full, 0 partial Sq Footage Unspecified Parking None Pet Policy No pets Deposit $0 DESCRIPTION Triple decker building secondfloor apt aprox 2000 sqf with large bedrooms, kitchen, pantry, porch, d/w, all woodfloor and ZTilded in the kitchen, new bath. utilities extra,Nov/1 see additional photos below Contact info: Payman Ahmadifar Bayside Realty (xxx) 939-4835 Posted: Sep 24, 2012, 6:55am PDT
PHP
nl2br(trim(strip_tags($html)));
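Expected output
Plain text containing <br> or newline characters, with no <div> or <table> HTML tags. Basically, the goal is to make the text more readable and preserve the original spacing/separation structure, but with no CSS styling or HTML tags other than <br>.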
John Smith | SomeName Realty | (xxx) 939-4835
Allston St, Cambridge, MA
Very spacious under renovation with SST/Granite, porch, minutes to MIT, redline, Nov/1
4BR/1BA Apartment $3,400/month
Bedrooms 4
Bathrooms 1 full, 0 partial
Sq Footage Unspecified
Parking None
Pet Policy No pets
Deposit $0
DESCRIPTION
Triple decker building secondfloor apt aprox 2000 sqf with large bedrooms, kitchen, pantry, porch, d/w, all woodfloor and ZTilded in the kitchen, new bath. utilities extra,Nov/1 see additional photos below
Contact info: Payman Ahmadifar Bayside Realty (xxx) 939-4835
Posted: Sep 24, 2012, 6:55am PDT
Answer 0 (score: 1)
You can get there with some string manipulation.
Try:
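// Remove all tags; text from adjacent elements is left separated only by whitespace.
$string = strip_tags($html);
// Assumes blocks end up separated by runs of three spaces; mark each run with a sentinel.
$string = str_replace(chr(32).chr(32).chr(32), "*****", $string);
// Split on the sentinel, then collapse whitespace inside each piece.
$newString = array_map(function ($var) { return trim(preg_replace('!\s+!', ' ', $var)); }, explode("*****", $string));
print(implode("\n", $newString));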
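
The approach above depends on how much whitespace the source HTML happens to contain between elements. An alternative that may be more robust is to turn block boundaries into newlines before calling strip_tags(). The following is only a sketch: the function name htmlToPlainLines, the list of tags treated as block-level, and the sample markup are assumptions made for illustration, not taken from the pastebin HTML.

```php
<?php
// Hypothetical helper (not part of the original answer): convert
// block-level boundaries to newlines, then strip the remaining tags.
function htmlToPlainLines(string $html): string
{
    // Insert a newline wherever a <br> occurs or a block-level element ends.
    // The tag list is an assumption; extend it for the markup at hand.
    $html = preg_replace('#<br\s*/?>|</(?:div|p|tr|table|li|h[1-6])\s*>#i', "\n", $html);

    // Put a space after each closed cell so adjacent cells do not run together.
    $html = preg_replace('#</t[dh]\s*>#i', ' ', $html);

    // Remove every remaining tag.
    $text = strip_tags($html);

    // Collapse whitespace within each line and drop empty lines.
    $lines = [];
    foreach (explode("\n", $text) as $line) {
        $line = trim(preg_replace('/\s+/', ' ', $line));
        if ($line !== '') {
            $lines[] = $line;
        }
    }
    return implode("\n", $lines);
}

// Sample markup invented for this example.
$html = '<div>John Smith | SomeName Realty</div>'
      . '<table><tr><td>Bedrooms</td><td>4</td></tr>'
      . '<tr><td>Parking</td><td>None</td></tr></table>';

echo htmlToPlainLines($html), "\n";
```

If <br> tags are wanted instead of plain newlines, the result can be passed through nl2br(), as in the original attempt.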