Hi all, I have an array like the one shown below:
Array
(
[0] => http://api.tweetmeme.com/imagebutton.gif?url=http://mashable.com/2010/09/25/trailmeme/
[1] => http://cdn.mashable.com/wp-content/plugins/wp-digg-this/i/gbuzz-feed.png
[2] => http://mashable.com/wp-content/plugins/wp-digg-this/i/fb.jpg
[3] => http://mashable.com/wp-content/plugins/wp-digg-this/i/diggme.png
[4] => http://ec.mashable.com/wp-content/uploads/2009/01/bizspark2.gif
[5] => http://cdn.mashable.com/wp-content/uploads/2010/09/web.png
[6] => http://mashable.com/wp-content/uploads/2010/09/Screen-shot-2010-09-24-at-10.51.26-PM.png
[7] => http://cdn.mashable.com/wp-content/uploads/2009/02/bizspark.jpg
[8] => http://feedads.g.doubleclick.net/~at/lxx00QTjYBaYojpnpnTa6MXUmh4/0/di
[9] =>
[10] => http://feedads.g.doubleclick.net/~at/lxx00QTjYBaYojpnpnTa6MXUmh4/1/di
[11] =>
[12] => http://feeds.feedburner.com/~ff/Mashable?i=0N_mvMwPHYk:j5Pmi_N-JQ8:D7DqB2pKExk
[13] =>
[14] => http://feeds.feedburner.com/~ff/Mashable?i=0N_mvMwPHYk:j5Pmi_N-JQ8:V_sGLiPBpWU
[15] =>
[16] => http://feeds.feedburner.com/~ff/Mashable?i=0N_mvMwPHYk:j5Pmi_N-JQ8:F7zBnMyn0Lo
[17] =>
[18] => http://feeds.feedburner.com/~ff/Mashable?d=qj6IDK7rITs
[19] =>
[20] => http://feeds.feedburner.com/~ff/Mashable?d=_e0tkf89iUM
[21] =>
[22] => http://feeds.feedburner.com/~ff/Mashable?i=0N_mvMwPHYk:j5Pmi_N-JQ8:gIN9vFwOqvQ
[23] =>
[24] => http://feeds.feedburner.com/~ff/Mashable?d=yIl2AUoC8zA
[25] =>
[26] => http://feeds.feedburner.com/~ff/Mashable?d=P0ZAIrC63Ok
[27] =>
[28] => http://feeds.feedburner.com/~ff/Mashable?d=I9og5sOYxJI
[29] =>
[30] => http://feeds.feedburner.com/~ff/Mashable?d=CC-BsrAYo0A
[31] =>
[32] => http://feeds.feedburner.com/~ff/Mashable?i=0N_mvMwPHYk:j5Pmi_N-JQ8:_cyp7NeR2Rw
[33] =>
[34] => http://feeds.feedburner.com/~r/Mashable/~4/0N_mvMwPHYk
)
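Basically, I want to remove the entries that are empty, that do not end in one of the extensions ".jpg,.png,.gif", or that contain any of the words "digg,fb,tweet,bizspark".
Hi, I tried the code above, but it returns an array containing entries I do not want, for example: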
Array ( [5] =>
http://feedads.g.doubleclick.net/~at/W-z_kHMi30EtE1mpxK8NvMmNmeg/0/di
[7] =>
http://feedads.g.doubleclick.net/~at/W-z_kHMi30EtE1mpxK8NvMmNmeg/1/di
[9] =>
http://feeds.feedburner.com/~ff/Mashable?i=mEedXAp78pg:339cIishd6A:D7DqB2pKExk
[11] =>
http://feeds.feedburner.com/~ff/Mashable?i=mEedXAp78pg:339cIishd6A:V_sGLiPBpWU
[13] =>
http://feeds.feedburner.com/~ff/Mashable?i=mEedXAp78pg:339cIishd6A:F7zBnMyn0Lo
[15] =>
http://feeds.feedburner.com/~ff/Mashable?d=qj6IDK7rITs
[17] =>
http://feeds.feedburner.com/~ff/Mashable?d=_e0tkf89iUM
[19] =>
http://feeds.feedburner.com/~ff/Mashable?i=mEedXAp78pg:339cIishd6A:gIN9vFwOqvQ
[21] =>
http://feeds.feedburner.com/~ff/Mashable?d=yIl2AUoC8zA
[23] =>
http://feeds.feedburner.com/~ff/Mashable?d=P0ZAIrC63Ok
[25] =>
http://feeds.feedburner.com/~ff/Mashable?d=I9og5sOYxJI
[27] =>
http://feeds.feedburner.com/~ff/Mashable?d=CC-BsrAYo0A
[29] =>
http://feeds.feedburner.com/~ff/Mashable?i=mEedXAp78pg:339cIishd6A:_cyp7NeR2Rw
[31] =>
http://feeds.feedburner.com/~r/Mashable/~4/mEedXAp78pg
)
I would like it to return, e.g. for the first example:
[5] => http://cdn.mashable.com/wp-content/uploads/2010/09/web.png
[6] => http://mashable.com/wp-content/uploads/2010/09/Screen-shot-2010-09-24-at-10.51.26-PM.png
Any ideas?
Hi GZip, I have modified the code and I am getting better results:
function url_array_filter($url)
{
    static $words = array('digg', 'fb', 'tweet', 'bizspark','feedburner','feedads','CountImage');
    static $extens = array('.jpg', '.png', '.gif');
    $ret = true;
    if (!$url) {
        $ret = false;
    } elseif (str_replace($words, '', $url) != $url) {
        $ret = false;
    } else {
        $path = parse_url($url, PHP_URL_PATH);
        if (in_array(substr($path, -4), $extens)) {
            $ret = false;
        }
    }
    return $ret;
}
My problem now is with the output. For example:
Array ( [0] => http://cdn.dzone.com/images/thumbs/120x90/491551.jpg' style='width:120;height:90;float:left;vertical-align:top;border:1px solid )
Array ( [0] => http://cdn.dzone.com/images/thumbs/120x90/490913.jpg' style='width:120;height:90;float:left;vertical-align:top;border:1px solid )
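I only want the URL. I think the problem is in how I extract the URLs from the original content. Let me post a link to the original question and what I am doing: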
RSS Feeds and image extraction indepth
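I just want the URL. Judging from that link, I think getImagesUrl() may be what is going wrong. I will try using parse_url to recover the correct URL. Let me know if I am on the right track. I am very close to extracting image URLs from RSS feeds parsed with Magpie.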
OK GZip, here are my modifications and additions to your code... it works 95% of the time! Great. I do still get some odd results, though, which I have posted below.
function url_array_filter($url)
{
    static $words = array('digg', 'fb', 'tweet', 'bizspark','feedburner','feedads','CountImage','fuelbrand');
    static $extens = array('.jpg', '.png', '.gif');
    $ret = true;
    if (!$url) {
        $ret = false;
    } elseif (str_replace($words, '', $url) != $url) {
        $ret = false;
    } else {
        $path = parse_url($url, PHP_URL_PATH);
        if (in_array(substr($path, -4), $extens)) {
            $ret = false;
        }
    }
    return $ret;
}
function cleanURL($a_url)
{
    $ret = array();
    foreach ($a_url as $c)
    {
        // Rebuild the URL from scheme, host and path only.
        $a = parse_url($c, PHP_URL_SCHEME).'://'.parse_url($c, PHP_URL_HOST).parse_url($c, PHP_URL_PATH);
        // Cut off anything after the first single quote.
        $a = explode("'", $a);
        $ret[] = $a[0];
    }
    return $ret;
}
Example usage. $this->getImagesUrl($c); below returns the result shown at the start of the question.
foreach($content as $c) {
    // get the images in content
    $arr = $this->getImagesUrl($c);
    $arr = array_filter($arr, 'url_array_filter');
}
$ret = cleanURL($arr);
if (count($ret) > 0)
{
    print_r($ret);
    echo "<br/><br/>";
}
So far almost everything works well, but I keep getting some bad results, such as:
Array ( [0] => http://cdn.mashable.com/wp-content/uploads/2010/02/ipad-side- )
Array ( [0] => http://mrg.bz/FZtr2k [1] => http://mrg.bz/IDkx4w )
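We are almost there, folks... any ideas?
Answer 0 (score: 6)
Using e.g. array_filter() will give you flexibility and ease of maintenance (changing requirements, debugging, etc.):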
function url_array_filter($url)
{
    static $words = array('digg', 'fb', 'tweet', 'bizspark');
    static $extens = array('.jpg', '.png', '.gif');
    $ret = true;
    if (!$url) {
        // Drop empty entries.
        $ret = false;
    } elseif (str_replace($words, '', $url) != $url) {
        // Drop URLs that contain any of the unwanted words.
        $ret = false;
    } else {
        $path = parse_url($url, PHP_URL_PATH);
        // Drop URLs whose path does not end in a wanted extension.
        if (!in_array(substr($path, -4), $extens)) {
            $ret = false;
        }
    }
    return $ret;
}
// $arr is the array shown in the question.
$arr = array_filter($arr, 'url_array_filter');
print_r($arr);
(Works with the given array, but may need changes; it is demo code.)
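To make the intended behaviour concrete, here is a minimal sketch of the same array_filter() idea run against a few entries taken from the question. The helper name keep_image_url, the use of pathinfo() to read the extension, and the five-entry sample are assumptions for illustration, not part of the demo code above. A URL is kept only if it is non-empty, contains none of the unwanted words, and has a path ending in one of the wanted extensions.

```php
<?php
// Hypothetical helper (name chosen for this sketch): true means "keep".
function keep_image_url($url)
{
    static $words  = array('digg', 'fb', 'tweet', 'bizspark');
    static $extens = array('jpg', 'png', 'gif');

    if (!$url) {
        return false;                        // drop empty entries
    }
    foreach ($words as $word) {
        if (strpos($url, $word) !== false) {
            return false;                    // drop URLs containing an unwanted word
        }
    }
    // Look only at the path, so a query string cannot fake an extension.
    $path = (string) parse_url($url, PHP_URL_PATH);
    $ext  = strtolower(pathinfo($path, PATHINFO_EXTENSION));
    return in_array($ext, $extens, true);    // keep only wanted extensions
}

// A few entries copied from the array in the question.
$sample = array(
    0 => 'http://api.tweetmeme.com/imagebutton.gif?url=http://mashable.com/2010/09/25/trailmeme/',
    1 => 'http://mashable.com/wp-content/plugins/wp-digg-this/i/fb.jpg',
    2 => 'http://cdn.mashable.com/wp-content/uploads/2010/09/web.png',
    3 => '',
    4 => 'http://feeds.feedburner.com/~ff/Mashable?d=qj6IDK7rITs',
);

$kept = array_filter($sample, 'keep_image_url');
print_r($kept);   // only key 2 (web.png) remains
```

Note that array_filter() preserves the original keys; wrap the result in array_values() if a re-indexed list is needed.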
Answer 1 (score: 3)
foreach ($array as $key => $value) {
    if (
        empty($value)||
        (preg_match('#^http:\/\/(.*)\.(gif|png|jpg)$#i', $value) == 0)||
        (preg_match('#(tweet|bizspark)#i', $value) > 0)
    ) {
        unset($array[$key]);
    }
}
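Because the first pattern is anchored with $, a value that still carries scraped markup after the file name (such as the dzone results in the question) would be removed by this loop. The following is a rough sketch of one way to handle that with the same regex approach: trim each value down to the URL first, then filter. The helper names extract_url and is_wanted_image, the choice to stop at the first whitespace, quote or angle bracket, and the addition of "digg" and "fb" to the word list are all assumptions made for this example.

```php
<?php
// Hypothetical helper: cut a scraped value down to the URL itself by
// stopping at the first whitespace, quote or angle bracket.
function extract_url($value)
{
    if (preg_match('~^https?://[^\s\'"<>]+~i', $value, $m)) {
        return $m[0];
    }
    return '';   // not a URL at all
}

// Hypothetical helper: true when the URL ends in a wanted extension
// (no query string or fragment) and contains none of the unwanted words.
function is_wanted_image($url)
{
    return preg_match('~^https?://[^?#]+\.(gif|png|jpg)$~i', $url) === 1
        && preg_match('~(digg|fb|tweet|bizspark)~i', $url) === 0;
}

// Sample values taken from the outputs shown in the question.
$raw = array(
    "http://cdn.dzone.com/images/thumbs/120x90/491551.jpg' style='width:120;height:90;float:left;vertical-align:top;border:1px solid",
    'http://mashable.com/wp-content/plugins/wp-digg-this/i/diggme.png',
    'http://mrg.bz/FZtr2k',
    '',
);

// Clean first, then filter, then re-index the surviving URLs.
$urls = array_values(array_filter(array_map('extract_url', $raw), 'is_wanted_image'));
print_r($urls);   // only the cleaned dzone .jpg URL remains
```

Cleaning before filtering matters here: the extension test can only succeed once the trailing attribute text has been cut off. A URL that was already truncated in the source data (no extension left) is dropped rather than repaired.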