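I am building a metasearch engine, but I'm stuck! Using PHP, I send a query to three search engines and pull the top 10 URLs from each. I then store these URLs in a two-dimensional array, each with a score for aggregation purposes, i.e. the first result gets 20 points, the second gets 18, and so on.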
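For reference, the scoring scheme described above could be sketched roughly as follows. The function name `scoreResults` and its parameters are hypothetical and only illustrate the "20, 18, 16, ..." rule; this is not the actual code used to fetch results.

```php
<?php
// Hypothetical helper: turn a ranked list of URLs into rows of [url, score].
// The first URL gets $top points and each following URL gets $step fewer,
// never going below zero.
function scoreResults(array $urls, int $top = 20, int $step = 2): array
{
    $scored = [];
    foreach (array_values($urls) as $rank => $url) {
        $scored[] = [
            'url'   => $url,
            'score' => max(0, $top - $step * $rank),
        ];
    }
    return $scored;
}

// Example input: the first three Blekko URLs from the results listed here.
$blekko = scoreResults([
    'php.about.com/',
    'php.net/',
    'en.wikipedia.org/wiki/PHP',
]);
print_r($blekko);
```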
So in the example below, I queried the search engines with 'php' and got the following results:
Blekko
Array ( [url] => php.about.com/ [score] => 20 )
Array ( [url] => php.net/ [score] => 18 )
Array ( [url] => en.wikipedia.org/wiki/PHP [score] => 16 )
Array ( [url] => www.phpbuilder.com/ [score] => 14 )
Array ( [url] => blekko.com/ws/http://php.about.com/+/seo [score] => 12 )
Array ( [url] => www.w3schools.com/php/default.asp [score] => 10 )
Array ( [url] => phpnuke.org/ [score] => 8 )
Array ( [url] => www.symfony-project.org/ [score] => 6 )
Array ( [url] => www.phpconference.co.uk/ [score] => 4 )
Entireweb
Array ( [url] => phpnuke.org/ [score] => 20 )
Array ( [url] => www.aardvarktopsitesphp.com/ [score] => 18 )
Array ( [url] => www.php.net/ [score] => 16 )
Array ( [url] => www.php.net/downloads.php [score] => 14 )
Array ( [url] => php.net/manual [score] => 12 )
Array ( [url] => www.php.net/manual/en/ [score] => 10 )
Array ( [url] => www.php.net/docs.php [score] => 8 )
Array ( [url] => www.php.net/license/ [score] => 6 )
Array ( [url] => www.phplinkdirectory.com/ [score] => 4 )
Bing
Array ( [url] => www.php.net/ [score] => 20 )
Array ( [url] => en.wikipedia.org/wiki/PHP [score] => 18 )
Array ( [url] => www.php.net/downloads.php [score] => 16 )
Array ( [url] => www.w3schools.com/php/default.asp [score] => 14 )
Array ( [url] => windows.php.net/download [score] => 12 )
Array ( [url] => windows.php.net/ [score] => 10 )
Array ( [url] => www.tizag.com/phpT/ [score] => 8 )
Array ( [url] => wiki.php.net/ [score] => 6 )
Array ( [url] => qa.php.net/ [score] => 4 )
Array ( [url] => www.php.com/ [score] => 2 )
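What I want to do is combine all of these results, removing duplicate URLs but adding their scores together, and build a new list from the aggregated results, which might look something like: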
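Array ( [url] => www.php.net/ [score] => 54 )
Array ( [url] => en.wikipedia.org/wiki/PHP [score] => 34 )
Array ( [url] => www.w3schools.com/php/default.asp [score] => 24 )
etc.
I'm just looking for the most efficient way to achieve this, and any suggestions would be greatly appreciated. Thanks.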
Answer 0 (score: 0)
1- You can trim the URLs. After that you will find that www.php.net and php.net are the same site (and likewise www.php.net and php.net/downloads.php).
2- Give more points to the results returned by Bing. Bing is arguably the most semantically capable of the three search engines.
3- You can also capture each result's title and save it in the array. This is a personal recommendation.
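The first two suggestions could be combined along these lines. This is only a minimal sketch under stated assumptions: `normalizeUrl` and `aggregateResults` are hypothetical names, the per-engine results are assumed to be arrays of `['url' => ..., 'score' => ...]` rows as in the question, and the normalization only strips the scheme, a leading `www.` and a trailing slash. It does not collapse different paths on the same host, which would be a separate design decision. Arrow functions require PHP 7.4 or later.

```php
<?php
// Hypothetical helper: reduce a URL to a key used only for comparison.
function normalizeUrl(string $url): string
{
    $url = trim($url);
    $url = preg_replace('#^https?://#i', '', $url); // drop the scheme, if any
    $url = preg_replace('#^www\.#i', '', $url);     // treat www.php.net as php.net
    return rtrim($url, '/');                        // ignore a trailing slash
}

// Hypothetical helper: merge the per-engine lists, summing scores of
// URLs that normalize to the same key. $weights is optional and maps an
// engine name to a multiplier (engines not listed get a weight of 1).
function aggregateResults(array $resultsByEngine, array $weights = []): array
{
    $merged = [];
    foreach ($resultsByEngine as $engine => $results) {
        $weight = $weights[$engine] ?? 1;
        foreach ($results as $row) {
            $key = normalizeUrl($row['url']);
            if (!isset($merged[$key])) {
                // Keep the first spelling of the URL that was seen.
                $merged[$key] = ['url' => $row['url'], 'score' => 0];
            }
            $merged[$key]['score'] += $row['score'] * $weight;
        }
    }
    // Highest aggregated score first; usort also reindexes the array.
    usort($merged, fn($a, $b) => $b['score'] <=> $a['score']);
    return $merged;
}

// A shortened subset of the data from the question.
$resultsByEngine = [
    'blekko' => [
        ['url' => 'php.about.com/', 'score' => 20],
        ['url' => 'php.net/', 'score' => 18],
        ['url' => 'en.wikipedia.org/wiki/PHP', 'score' => 16],
    ],
    'entireweb' => [
        ['url' => 'phpnuke.org/', 'score' => 20],
        ['url' => 'www.php.net/', 'score' => 16],
    ],
    'bing' => [
        ['url' => 'www.php.net/', 'score' => 20],
        ['url' => 'en.wikipedia.org/wiki/PHP', 'score' => 18],
    ],
];

$combined = aggregateResults($resultsByEngine);
print_r($combined);
```

With this subset, the top entry is `php.net/` with a score of 54 (18 + 16 + 20), followed by `en.wikipedia.org/wiki/PHP` with 34. To favour one engine as suggested in point 2, a call such as `aggregateResults($resultsByEngine, ['bing' => 1.5])` could be used; the value 1.5 is an arbitrary illustration, not a recommended weight.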