PHP - Extracting multiple pieces of information from array items containing HTML

Asked: 2015-01-28 08:51:58

Tags: php arrays

I have an array named $topPaid:

Array
(
    [0] => <li>
        <a class="livelink" href="#%21/content/5664">
            <span title="Relief Terrain Pack v3.2" class="title">Relief Terrain Pack v3.2</span>
            <br><small>
                    Editor Extensions/Terrain
            </small>
            <br></a>
    </li>
    [1] => <li>
        <a class="livelink" href="#%21/content/368">
            <span title="Playmaker" class="title">Playmaker</span>
            <br><small>
                    Editor Extensions/Visual Scripting
            </small>
            <br></a>
    </li>
    [2] => <li>
        <a class="livelink" href="#%21/content/4243">
            <span title="Amplify Motion" class="title">Amplify Motion</span>
            <br><small>
                    Scripting/Effects
            </small>
            <br></a>
    </li>
    [3] => <li>
        <a class="livelink" href="#%21/content/16899">
            <span title="Skele: Character Animation Tools" class="title">Skele: Character Animation Tools</span>
            <br><small>
                    Editor Extensions/Modeling
            </small>
            <br></a>
    </li>
    [4] => <li>
        <a class="livelink" href="#%21/content/19245">
            <span title="SnazzyGrid" class="title">SnazzyGrid</span>
            <br><small>
                    Editor Extensions/Utilities
            </small>
            <br></a>
    </li>
    [5] => <li>
        <a class="livelink" href="#%21/content/19352">
            <span title="Zones, Fields, and Shields" class="title">Zones, Fields, and Shields</span>
            <br><small>
                    Shaders
            </small>
            <br></a>
    </li>
    [6] => <li>
        <a class="livelink" href="#%21/content/18920">
            <span title="PlayerPrefs Elite" class="title">PlayerPrefs Elite</span>
            <br><small>
                    Scripting/Integration
            </small>
            <br></a>
    </li>
    [7] => <li>
        <a class="livelink" href="#%21/content/18358">
            <span title="Bolt" class="title">Bolt</span>
            <br><small>
                    Scripting/Network
            </small>
            <br></a>
    </li>
    [8] => <li>
        <a class="livelink" href="#%21/content/13198">
            <span title="BIG Environment Pack Vol.3" class="title">BIG Environment Pack Vol.3</span>
            <br><small>
                    3D Models/Environments
            </small>
            <br></a>
    </li>
    [9] => <li>
        <a class="livelink" href="#%21/content/23930">
            <span title="VertExmotion" class="title">VertExmotion</span>
            <br><small>
                    Editor Extensions/Animation
            </small>
            <br></a>
    </li>
)

Now I am trying to extract the "href" link, the "title", and the text inside "small" so I can display them in a table, using this code:

foreach($topPaid as $key => $value)
{
    $xml = simplexml_load_string($key);
    $list = $xml->xpath("//@href");
    $preparedUrls = array();
    foreach($list as $item) {
        $item = parse_url($item);
        $preparedUrls[] = $item['scheme'] . '://' .  $item['host'] . '/';
    }
    print_r($preparedUrls);
}

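But I always get an error saying that I am trying to call a member function on a non-object. Should I go through each array element line by line and parse the contents of each line, or is there a better way to get this information?

Greetz,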

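2 answers:

Answer 0: (score: 0)

I have not tried your code, but it looks to me like the error is on this line, which passes the array index $key instead of the HTML string: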

$xml = simplexml_load_string($key);

It should be:

$xml = simplexml_load_string($value);

Also, the text you pass to that function is not valid XML: <br> should be <br />.
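The two fixes above can be sketched together as follows. This is only a minimal sketch: the helper name parseItemAsXml is hypothetical, it assumes the SimpleXML extension is available, and it assumes every entry has the same <li><a><span>/<small> shape as the dump in the question. A title containing a bare & would still not be well-formed XML.

```php
<?php
// Hypothetical helper (not from the original post): parse one entry as XML.
function parseItemAsXml(string $html): ?array
{
    // SimpleXML needs well-formed XML, so self-close the <br> tags first.
    $xml = simplexml_load_string(str_replace('<br>', '<br />', $html));
    if ($xml === false) {
        // Parsing failed. Calling ->xpath() on this false value is what
        // produces the "member function on a non-object" error.
        return null;
    }
    return [
        'href'  => (string) $xml->a['href'],
        'title' => (string) $xml->a->span['title'],
        'type'  => trim((string) $xml->a->small),
    ];
}

// One entry copied from the question's array.
$topPaid = [
    '<li>
        <a class="livelink" href="#%21/content/5664">
            <span title="Relief Terrain Pack v3.2" class="title">Relief Terrain Pack v3.2</span>
            <br><small>
                    Editor Extensions/Terrain
            </small>
            <br></a>
    </li>',
];

foreach ($topPaid as $key => $value) {
    // Pass $value (the HTML string), not $key (the numeric index).
    print_r(parseItemAsXml($value));
}
```

Checking the return value against false before using it keeps a malformed entry from aborting the whole loop.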

Answer 1: (score: 0)

Solved by checking each line, as follows:

foreach($topPaid as $key => $value)
{
    // Split the HTML snippet into lines, whatever the line-ending style.
    foreach(preg_split("/((\r?\n)|(\r\n?))/", $value) as $line)
    {
        $difline = strip_tags($line);
        if(strpos($line, '<a class="livelink" href="#%21/') !== false)
        {
            // Keep what follows href=" and drop the trailing ">
            $link = explode('href="', $line);
            $link = substr(rtrim($link[1]), 0, -2);
            // e.g. https://www.assetstore.unity3d.com/en/#!/content/4243
            $link = str_replace('#%21', 'https://www.assetstore.unity3d.com/en/#!', $link);
        }
        else if(strpos($line, '<span title') !== false)
        {
            // Keep the text between title= and class=, without quotes or padding.
            $title = explode('title=', $line);
            $title = explode('class=', $title[1]);
            $title = trim($title[0], "\" \t");
        }
        else if($difline == $line && trim($line) !== '')
        {
            // A non-empty line without any tags is the category text inside <small>.
            $type = trim($line);
        }
    }
}
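As an alternative to matching lines by hand, the same three values could be read with PHP's HTML parser, which tolerates the unclosed <br> tags. This is a sketch under assumptions, not the poster's method: the function name extractAssetInfo is hypothetical, the DOM extension is assumed to be enabled, and the URL prefix is the one used in the loop above.

```php
<?php
// Hypothetical helper (not from the original post): collect link, title and
// category for every entry by querying a parsed DOM instead of raw lines.
function extractAssetInfo(array $items): array
{
    $rows = [];
    foreach ($items as $html) {
        $doc = new DOMDocument();
        // Collect parser warnings internally instead of printing them.
        $previous = libxml_use_internal_errors(true);
        $doc->loadHTML($html);
        libxml_clear_errors();
        libxml_use_internal_errors($previous);

        $xpath = new DOMXPath($doc);
        $link  = $xpath->query('//a[@class="livelink"]')->item(0);
        $span  = $xpath->query('//span[@class="title"]')->item(0);
        $small = $xpath->query('//small')->item(0);
        if ($link === null || $span === null || $small === null) {
            continue; // skip entries that do not have the expected shape
        }

        $rows[] = [
            // Same prefix replacement as in the line-based loop.
            'href'  => str_replace(
                '#%21',
                'https://www.assetstore.unity3d.com/en/#!',
                $link->getAttribute('href')
            ),
            'title' => $span->getAttribute('title'),
            'type'  => trim($small->textContent),
        ];
    }
    return $rows;
}

// Two entries copied from the question's array.
$topPaid = [
    '<li>
        <a class="livelink" href="#%21/content/368">
            <span title="Playmaker" class="title">Playmaker</span>
            <br><small>
                    Editor Extensions/Visual Scripting
            </small>
            <br></a>
    </li>',
    '<li>
        <a class="livelink" href="#%21/content/4243">
            <span title="Amplify Motion" class="title">Amplify Motion</span>
            <br><small>
                    Scripting/Effects
            </small>
            <br></a>
    </li>',
];

$rows = extractAssetInfo($topPaid);
print_r($rows);
```

Each row is a plain associative array, so building the table becomes a simple loop over $rows, and the extraction no longer depends on how the HTML is split across lines.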