How do I remove this string:
class="size-full wp-image-1561 "
from this string:
class="size-full wp-image-1561 " alt="Class-A warehouse facility developed by Panattoni Europe in Germany for Rudolph Logistik Gruppe." src="http://europe-re.com/wp-content/uploads/2012/12/Class-A-warehouse-facility-developed-by-Panattoni-Europe-in-Germany-for-Rudolph-Logistik-Gruppe.jpg"
Keep in mind that the class changes with every record. How can I do this dynamically?
Something like: remove class="(whatever is inside here)" from the full string.
Thanks in advance!
Answer 0 (score: 1)
If a regular expression is the only tool you have at hand, you can match
\bclass="[^"]*"\s*
and replace it with an empty string.
\b ensures that we match class and not subclass.
With [^"]*, we match zero or more characters other than ".
With \s*, we also consume the whitespace after the attribute, so the result is trimmed automatically.
See the demo.
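As a rough illustration of how that pattern could be applied in PHP, here is a minimal sketch using preg_replace(). The variable names are placeholders, and the sample string reuses the attributes from the question with the shortened src URL, so treat it as an example under those assumptions rather than a drop-in solution.

```php
<?php
// Sample input based on the question (src URL shortened for readability).
$str = 'class="size-full wp-image-1561 " alt="Class-A warehouse facility" '
     . 'src="http://europe-re.com/wp-content/uploads/2012/12/Gruppe.jpg"';

// \b     word boundary, so an attribute such as subclass="..." is left alone
// [^"]*  the attribute value: any run of characters except a double quote
// \s*    whitespace after the closing quote, so no leading gap remains
$pattern = '/\bclass="[^"]*"\s*/';

$result = preg_replace($pattern, '', $str);

echo $result, PHP_EOL;
```

Because the value is matched with [^"]*, the pattern does not depend on what the class actually contains, which is what makes it work for every record.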
However, if you are working with PHP, it is better to use DOMDocument. Something like:
$html = <<<HTML
<div id="res">Some text inside DIV
<img
class="size-full wp-image-1561 "
alt="Class-A warehouse facility developed by Panattoni Europe in Germany for Rudolph Logistik Gruppe."
src="http://europe-re.com/wp-content/uploads/2012/12/Gruppe.jpg">
<img
alt="Class-A warehouse facility developed by Panattoni Europe in Germany for Rudolph Logistik Gruppe."
src="http://europe-re.com/wp-content/uploads/2012/12/Gruppe.jpg">
</div>
HTML;
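$dom = new DOMDocument();
// Set before loading; the property has no effect once the document is parsed.
$dom->preserveWhiteSpace = false;
// @ suppresses libxml warnings about imperfect markup.
@$dom->loadHTML($html);
$images = $dom->getElementsByTagName('img');
// Collect first: removing nodes while iterating the live list skips elements.
$imgs = array();
foreach ($images as $img) {
    if ($img->hasAttribute('class')) {
        $imgs[] = $img;
    }
}
// Note: this removes the whole <img> element, not just its class attribute.
foreach ($imgs as $img) {
    $img->parentNode->removeChild($img);
}
$str = $dom->saveHTML();
echo $str;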
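Note that the snippet above removes every img element that carries a class attribute. If the goal is to keep the image and drop only the attribute, as the question describes, a variant along these lines might be closer. This is only a sketch: the markup is a shortened, hypothetical sample, and it relies on the standard DOMElement methods hasAttribute() and removeAttribute().

```php
<?php
// Hypothetical sample markup, shortened from the example above.
$html = '<div id="res"><img class="size-full wp-image-1561 " alt="Warehouse" '
      . 'src="http://europe-re.com/wp-content/uploads/2012/12/Gruppe.jpg"></div>';

$dom = new DOMDocument();
$dom->loadHTML($html);

// Changing attributes does not alter the node list, so iterating it directly is safe.
foreach ($dom->getElementsByTagName('img') as $img) {
    if ($img->hasAttribute('class')) {
        $img->removeAttribute('class');
    }
}

// saveHTML() returns the whole document, including the html/body wrapper libxml adds.
$str = $dom->saveHTML();
echo $str;
```

Unlike the regex, this approach does not care about attribute order or quoting style, since the parser handles both.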
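Answer 1 (score: 0)
Answer 2 (score: 0)
If the class changes for each record but the format stays the same, I would go for a faster solution. Just find the position of the second " and take the substring after it: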
substr($str, 8+strpos(substr($str,7), '"'))
If the format does change, take a look at preg_replace().
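The offset arithmetic in that one-liner can be hard to read, so here is a spelled-out sketch of the same idea. It assumes the string always starts with class=" (7 characters), uses strpos() with an offset instead of the nested substr(), and adds an ltrim() to drop the space left after the closing quote. The sample string and variable names are illustrative only.

```php
<?php
// Sample input; assumed to always begin with class=" as in the question.
$str = 'class="size-full wp-image-1561 " alt="Warehouse" '
     . 'src="http://europe-re.com/wp-content/uploads/2012/12/Gruppe.jpg"';

// 'class="' is 7 characters long, so the attribute value starts at offset 7.
$valueStart = 7;

// Position of the closing quote, measured from the start of $str.
$closingQuote = strpos($str, '"', $valueStart);

// Skip the closing quote itself, then trim the space that follows it.
$rest = ltrim(substr($str, $closingQuote + 1));

echo $rest, PHP_EOL;
```

Here $closingQuote + 1 equals 8 + strpos(substr($str, 7), '"') from the one-liner, so both take the same substring; the only difference is the extra trim.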