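I am scraping data from a URL. So far I have mostly dealt with ul, li, and similar tags.
This time I ran into dl tags, and when I use my scrape_between function on them it does not return anything. Here is the HTML I am working with: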
<div id='gallery-1' class='gallery galleryid-273 gallery-columns-2 gallery-size-full'><dl class='gallery-item'>
<dt class='gallery-icon portrait'>
<a href='https://example.com/wp-content/uploads/2013/11/gf-1.jpg?fit=650%2C976' data-rel="lightbox-gallery-1"><img width="650" height="976" src="data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7" data-src="https://example.com/wp-content/uploads/2013/11/gf-1.jpg?fit=650%2C976" class="attachment-full size-full" alt="" aria-describedby="gallery-1-16311" data-srcset="https://example.com/wp-content/uploads/2013/11/gf-1.jpg?w=650 650w, https://example.com/wp-content/uploads/2013/11/gf-1.jpg?resize=200%2C300 200w" data-sizes="(max-width: 650px) 100vw, 650px" /></a>
</dt>
<dd class='wp-caption-text gallery-caption' id='gallery-1-16311'>
Ground Floor Plan
</dd></dl><dl class='gallery-item'>
<dt class='gallery-icon portrait'>
<a href='https://example.com/wp-content/uploads/2013/11/ff.jpg?fit=649%2C1024' data-rel="lightbox-gallery-1"><img width="649" height="1024" src="data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7" data-src="https://example.com/wp-content/uploads/2013/11/ff.jpg?fit=649%2C1024" class="attachment-full size-full" alt="" aria-describedby="gallery-1-16312" data-srcset="https://example.com/wp-content/uploads/2013/11/ff.jpg?w=649 649w, https://example.com/wp-content/uploads/2013/11/ff.jpg?resize=190%2C300 190w" data-sizes="(max-width: 649px) 100vw, 649px" /></a>
</dt>
<dd class='wp-caption-text gallery-caption' id='gallery-1-16312'>
First Floor pLan
</dd></dl><br style="clear: both" />
</div>
Can someone help me?
The scrape_between function:
function scrape_between($data, $start, $end){
    $data = stristr($data, $start);
    $data = substr($data, strlen($start));
    $stop = stripos($data, $end);
    $data = substr($data, 0, $stop);
    return $data;
}
I need to scrape the images inside the dt tags.
This is the code I am trying:
$project_images = scrape_between($data, '<dl class="gallery-item', '<br style="clear: both">');
Any suggestions would be appreciated.
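Answer 0 (score: 0)
In the end I found a solution myself. I did not find an answer to this problem anywhere here, so I decided to answer it to help others.
I solved the problem of getting the dl images with this code: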
from atlassian import Confluence
confluence = Confluence(
url='http://localhost:8090',
username='admin',
password='admin')
status = confluence.create_page(
space='DEMO',
title='This is the title',
body='This is the body. You can use <strong>HTML tags</strong>!')
print(status)
Answer 1 (score: -1)
You can use a loop inside the function and have it return an array:
// Returns an array with every block found between $start and $end
function scrape_between($data, $start, $end) {
    $html_array = explode($start, $data);
    $clean_html_arr = [];
    foreach ($html_array as $position => $html_array_element) {
        // Element 0 is the text before the first $start, so skip it
        if ($position > 0) {
            $html_exploded = explode($end, $html_array_element);
            $clean_html_arr[] = $start . $html_exploded[0] . $end;
        }
    }
    return $clean_html_arr;
}
And use it like this:
// Test the function
foreach (scrape_between($str, "<dl class='gallery-item'>", "</dl>") as $key => $htmlBlock) {
    echo htmlspecialchars($htmlBlock) . '<br><br>';
}
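Note that string delimiters have to match the markup character for character. The HTML in the question writes class='gallery-item' with single quotes and closes the line break as `<br style="clear: both" />`, so a start string with double quotes or an end string without ` />` will not be found. As an alternative that does not depend on quoting, here is a minimal sketch using PHP's built-in DOMDocument and DOMXPath. The function name extract_gallery_images and the shortened $sample markup are made up for illustration and do not come from the posts above.

```php
<?php
// Hypothetical helper (not from the original posts): collects the link URL
// and caption of every <dl> whose class contains "gallery-item".
function extract_gallery_images(string $html): array {
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of printing them,
    // since real-world markup is rarely perfectly valid.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $images = [];
    foreach ($xpath->query("//dl[contains(@class, 'gallery-item')]") as $dl) {
        // First link inside the <dt> of this gallery item
        $link = $xpath->query(".//dt//a[@href]", $dl)->item(0);
        if ($link === null) {
            continue;
        }
        // Caption text from the <dd>, if there is one
        $caption = $xpath->query(".//dd", $dl)->item(0);
        $images[] = [
            'url'     => $link->getAttribute('href'),
            'caption' => $caption !== null ? trim($caption->textContent) : '',
        ];
    }
    return $images;
}

// Shortened, assumed sample in the same shape as the gallery markup
$sample = "<div id='gallery-1'><dl class='gallery-item'>"
    . "<dt class='gallery-icon portrait'>"
    . "<a href='https://example.com/wp-content/uploads/2013/11/gf-1.jpg?fit=650%2C976'>"
    . "<img src='placeholder.gif' alt='' /></a></dt>"
    . "<dd class='wp-caption-text gallery-caption'>\nGround Floor Plan\n</dd>"
    . "</dl></div>";

foreach (extract_gallery_images($sample) as $image) {
    echo $image['caption'], ': ', $image['url'], PHP_EOL;
}
```

Because the parser normalizes attribute quoting, the same query works whether the page uses single or double quotes. If the lazy-loaded image address is wanted instead of the link target, reading the data-src attribute of the img element in the same way should work.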