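I'm trying to extract data from a string, but without any luck. In the sample code below I tried preg_split, but it doesn't give me the result I want.
Using the following code: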
<?php
$str = '<a href="https://rads.stackoverflow.com/amzn/click/com/B008EYEYBA" rel="nofollow noreferrer">Nike Air Jordan SC-2 Mens Basketball Shoes 454050-035</a><img src="http://www.assoc-amazon.com/e/ir?t=mytwitterpage-20&l=as2&o=1&a=B008EYEYBA" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" />
';
$chars = preg_split('/ /', $str, -1, PREG_SPLIT_OFFSET_CAPTURE);
echo '<pre>';
print_r($chars);
echo '</pre>';
?>
gives the result:
Array
(
[0] => Array
(
[0] => <a
[1] => 0
)
[1] => Array
(
[0] => href="https://rads.stackoverflow.com/amzn/click/com/B008EYEYBA" rel="nofollow noreferrer">Nike
[1] => 3
)
[2] => Array
(
[0] => Air
[1] => 167
)
[3] => Array
(
[0] => Jordan
[1] => 171
)
[4] => Array
(
[0] => SC-2
[1] => 178
)
[5] => Array
(
[0] => Mens
[1] => 183
)
[6] => Array
(
[0] => Basketball
[1] => 188
)
[7] => Array
(
[0] => Shoes
[1] => 199
)
[8] => Array
(
[0] => 454050-035</a><img
[1] => 205
)
[9] => Array
(
[0] => src="http://www.assoc-amazon.com/e/ir?t=mytwitterpage-20&l=as2&o=1&a=B008EYEYBA"
[1] => 224
)
[10] => Array
(
[0] => width="1"
[1] => 305
)
[11] => Array
(
[0] => height="1"
[1] => 315
)
[12] => Array
(
[0] => border="0"
[1] => 326
)
[13] => Array
(
[0] => alt=""
[1] => 337
)
[14] => Array
(
[0] => style="border:none
[1] => 344
)
[15] => Array
(
[0] => !important;
[1] => 363
)
[16] => Array
(
[0] => margin:0px
[1] => 375
)
[17] => Array
(
[0] => !important;"
[1] => 386
)
[18] => Array
(
[0] => />
[1] => 399
)
)
Note that element [1] of the array also contains the word Nike, when all I need from it is the URL.
[1] => Array
(
[0] => href="https://rads.stackoverflow.com/amzn/click/com/B008EYEYBA" rel="nofollow noreferrer">Nike
[1] => 3
)
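Actually, my ultimate aim in extracting from $str is just to output the source URL and the anchor text into a separate array, like this:
URL:
Anchor text:
Nike Air Jordan SC-2 Mens Basketball Shoes 454050-035
Any ideas on how to achieve this would be much appreciated.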
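Answer 0 (score: 0)
You can do this with the help of a PHP function.
You want to remove the anchor tag here.
You can use the strip_tags() function to remove all the tags.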
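As a minimal sketch of that suggestion, assuming $str holds the snippet from the question (shortened slightly here), strip_tags() leaves only the text between the tags:

```php
<?php
// Assumption: $str is the markup from the question, with the style attribute omitted for brevity.
$str = '<a href="https://rads.stackoverflow.com/amzn/click/com/B008EYEYBA" rel="nofollow noreferrer">Nike Air Jordan SC-2 Mens Basketball Shoes 454050-035</a><img src="http://www.assoc-amazon.com/e/ir?t=mytwitterpage-20&l=as2&o=1&a=B008EYEYBA" width="1" height="1" border="0" alt="" />' . "\n";

// strip_tags() removes every tag together with its attributes;
// trim() drops the trailing newline that was part of the original string.
$anchorText = trim(strip_tags($str));

echo $anchorText, "\n";
```

Note that this only recovers the anchor text. The href value is discarded along with the tag, so the URL still has to be extracted some other way.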
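Answer 1 (score: 0)
Using regular expressions to parse HTML is bad practice. PHP has the DOM extension. You simply cannot build a general-purpose regex that works for every piece of HTML you might encounter. The DOM approach scales much better.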
$string = '<a href="http://rads.stackoverflow.com/amzn/click/B008EYEYBA">Nike Air Jordan SC-2 Mens Basketball Shoes 454050-035</a><img src="http://www.assoc-amazon.com/e/ir?t=mytwitterpage-20&l=as2&o=1&a=B008EYEYBA" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" />';
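$dom = new DOMDocument();
// Collect libxml warnings internally instead of printing them for imperfect markup.
libxml_use_internal_errors(true);
$dom->loadHTML($string);
libxml_clear_errors();
// Take the first <a> element, then read its text and its href attribute.
$elementA = $dom->getElementsByTagName('a')->item(0);
$aText = $elementA->nodeValue;
$aLink = $elementA->getAttribute('href');
echo $aLink . "\n" . $aText;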
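Since the question asks for the URL and anchor text in a separate array, the same DOM approach could be wrapped roughly as follows. This is only a sketch: extractLinks() and the 'url' / 'text' keys are hypothetical names chosen for illustration, and the input is assumed to be a non-empty HTML fragment.

```php
<?php
// Hypothetical helper (not from the answer above): returns one entry per <a> element.
function extractLinks(string $html): array
{
    $dom = new DOMDocument();
    // Keep libxml warnings out of the output, then restore the previous setting.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $links = [];
    foreach ($dom->getElementsByTagName('a') as $a) {
        $links[] = [
            'url'  => $a->getAttribute('href'),
            'text' => trim($a->textContent),
        ];
    }
    return $links;
}

$string = '<a href="http://rads.stackoverflow.com/amzn/click/B008EYEYBA">Nike Air Jordan SC-2 Mens Basketball Shoes 454050-035</a>';
print_r(extractLinks($string));
```

Returning a list of pairs rather than a single pair means the same function still works if the fragment later contains more than one link.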