Split HTML into an array (PHP)

Time: 2018-01-10 14:43:53

Tags: php

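I have to receive data from a function that returns HTML. The best solution would be to edit that code so it works better, but it is live code and not mine, so I cannot edit it.

Could I get some guidance to help achieve what I need?

The returned HTML: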

<a href="/newsitems">News</a>
<a href="/news/roman-catapults/16465">Roman Catapults</a>
<a href="/news/year-3-roman-experience/13835">Year 3 Roman Experience</a>
<a href="/news/year-3-dewa-roman-experience/15746">Year 3 Dewa Roman Experience</a>
<a href="/news/science-week-day-1/15423">Science Week</a><a href="/news/world-book-day/15104">World Book Day</a>
<a href="/news/year-6-trip-to-the-lion-salt-works/15762">Year 6 trip to the Lion Salt Works</a><a href="/news/learning-logs/13839">Learning Logs</a>
<a href="/news/working-together/13838">Working Together</a>
<a href="/news/learning-logs/13837">Learning Logs</a>
<a href="/news/year-2-curriculum-map-for-autumn-2/13377">Year 2 Curriculum Map for Autumn 2</a> 

I know there are methods such as:

  • Regex
  • explode
  • implode

However, my knowledge of these is not the best, and I would like some guidance to help me learn.

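What I want to achieve:

  • Split each line into an array
  • Get the text from each link as well as the link itself
  • E.g. the first line: array => (title => "News", link => "/newsitems")

Reason:

I cannot edit the function that returns this HTML, and I want to display the HTML better than the data as returned.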

1 answer:

Answer 0: (score: 1)

Using a PHP HTML Parser would be the most robust solution to your problem. However, if you just want a quick one-off way to break the sample HTML into an array, you can use explode() on new lines, like so:

$html = '<a href="/newsitems">News</a>
<a href="/news/roman-catapults/16465">Roman Catapults</a>
<a href="/news/year-3-roman-experience/13835">Year 3 Roman Experience</a>
<a href="/news/year-3-dewa-roman-experience/15746">Year 3 Dewa Roman Experience</a>
<a href="/news/science-week-day-1/15423">Science Week</a><a href="/news/world-book-day/15104">World Book Day</a>
<a href="/news/year-6-trip-to-the-lion-salt-works/15762">Year 6 trip to the Lion Salt Works</a><a href="/news/learning-logs/13839">Learning Logs</a>
<a href="/news/working-together/13838">Working Together</a>
<a href="/news/learning-logs/13837">Learning Logs</a>';

// Split the HTML on newlines, then strip stray whitespace from each line
$array = explode("\n", $html);
$array = array_map('trim', $array);

If you want to further parse the array items so that the link and the element text are split out, you can do the following:

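$final = array();
foreach ($array as $v) {
    $v = trim($v);
    // capture things in the href attribute and within the tags;
    // skip lines without a link so $matches[1] and $matches[2] always exist
    if (!preg_match('/href="([^"]*)">([^<]*)<\/a>/', $v, $matches)) {
        continue;
    }
    $final[] = array(
        'originalelement' => $v,
        'url' => $matches[1],
        'text' => $matches[2]
    );
}

$final will now have what you are looking for, e.g.:

array(
    array(
        "originalelement" => '<a href="/newsitems">News</a>',
        "url" => "/newsitems",
        "text" => "News"
    ),
    array(
        "originalelement" => '<a href="/news/roman-catapults/16465">Roman Catapults</a>',
        "url" => "/news/roman-catapults/16465",
        "text" => "Roman Catapults"
    )
)

Note that this solution will work with the HTML you have listed here, but HTML is a tricky beast: if the a elements have nested elements (for example a b or span), the regex will not capture those. Also, preg_match() returns only the first match, so on the lines that hold two links only the first link is captured.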
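For comparison, here is a minimal sketch of the parser-based route mentioned at the top of this answer, using PHP's bundled DOMDocument class (this assumes the dom extension is enabled, which it is in most default builds). The function name extractLinks() and the shortened sample input are hypothetical, chosen only for illustration; the title and link keys follow the shape asked for in the question. Because it walks the parsed tree instead of matching text, it should pick up several links on one line as well as links with nested elements:

```php
<?php
// Hypothetical sample: two links share a line and one contains a nested <span>.
$html = '<a href="/newsitems">News</a>
<a href="/news/science-week-day-1/15423">Science Week</a><a href="/news/world-book-day/15104">World <span>Book</span> Day</a>';

// extractLinks() is an illustrative helper name, not part of the original answer.
function extractLinks(string $html): array
{
    $doc = new DOMDocument();

    // Collect libxml warnings internally instead of printing them,
    // since an HTML fragment is not a complete document.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $links = [];
    foreach ($doc->getElementsByTagName('a') as $a) {
        $links[] = [
            // textContent flattens nested elements into plain text
            'title' => trim($a->textContent),
            'link'  => $a->getAttribute('href'),
        ];
    }
    return $links;
}

print_r(extractLinks($html));
```

This trades the one-line regex for a little more setup, which is probably worth it only if the returned HTML may change shape. Note that the sample here is plain ASCII; non-ASCII text may need extra encoding handling when calling loadHTML().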