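Wikipedia uses an "HTML sitemap" to link to every content page. The huge number of pages has to be split into many groups so that each index page carries at most about 100 links.
This is how Wikipedia does it:
The complete list of articles is divided into several larger groups, each defined by its first and last word:
When you click one of these categories, that range (e.g. "earth" to "mourning") is divided in the same way. The process is repeated until the current range contains only about 100 articles, so that they can be displayed directly.
I really like this approach to link lists, because it minimizes the number of clicks needed to reach any article.
How can you create such a list of articles automatically?
So my question is how to automatically create such an index page, one that lets you click through to ever smaller categories until the number of articles contained is small enough to display them.
Imagine you are given an array of all article names. How would you start coding an index with automatic category splitting?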
Array('AAA rating', 'abdicate', ..., 'zero', 'zoo')
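It would be great if you could help me. Of course, I don't need a perfect solution, just a useful approach. Thank you very much in advance!
Edit: I have now found the relevant part in Wikipedia's software (MediaWiki):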
<?php
/**
* Implements Special:Allpages
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License along
* with this program; if not, write to the Free Software Foundation, Inc.,
* 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
* http://www.gnu.org/copyleft/gpl.html
*
* @file
* @ingroup SpecialPage
*/
/**
* Implements Special:Allpages
*
* @ingroup SpecialPage
*/
class SpecialAllpages extends IncludableSpecialPage {
/**
* Maximum number of pages to show on single subpage.
*/
protected $maxPerPage = 345;
/**
* Maximum number of pages to show on single index subpage.
*/
protected $maxLineCount = 100;
/**
* Maximum number of chars to show for an entry.
*/
protected $maxPageLength = 70;
/**
* Determines, which message describes the input field 'nsfrom'.
*/
protected $nsfromMsg = 'allpagesfrom';
function __construct( $name = 'Allpages' ){
parent::__construct( $name );
}
/**
* Entry point : initialise variables and call subfunctions.
*
* @param $par String: becomes "FOO" when called like Special:Allpages/FOO (default NULL)
*/
function execute( $par ) {
global $wgRequest, $wgOut, $wgContLang;
$this->setHeaders();
$this->outputHeader();
$wgOut->allowClickjacking();
# GET values
$from = $wgRequest->getVal( 'from', null );
$to = $wgRequest->getVal( 'to', null );
$namespace = $wgRequest->getInt( 'namespace' );
$namespaces = $wgContLang->getNamespaces();
$wgOut->setPagetitle(
( $namespace > 0 && in_array( $namespace, array_keys( $namespaces) ) ) ?
wfMsg( 'allinnamespace', str_replace( '_', ' ', $namespaces[$namespace] ) ) :
wfMsg( 'allarticles' )
);
if( isset($par) ) {
$this->showChunk( $namespace, $par, $to );
} elseif( isset($from) && !isset($to) ) {
$this->showChunk( $namespace, $from, $to );
} else {
$this->showToplevel( $namespace, $from, $to );
}
}
/**
* HTML for the top form
*
* @param $namespace Integer: a namespace constant (default NS_MAIN).
* @param $from String: dbKey we are starting listing at.
* @param $to String: dbKey we are ending listing at.
*/
function namespaceForm( $namespace = NS_MAIN, $from = '', $to = '' ) {
global $wgScript;
$t = $this->getTitle();
$out = Xml::openElement( 'div', array( 'class' => 'namespaceoptions' ) );
$out .= Xml::openElement( 'form', array( 'method' => 'get', 'action' => $wgScript ) );
$out .= Html::hidden( 'title', $t->getPrefixedText() );
$out .= Xml::openElement( 'fieldset' );
$out .= Xml::element( 'legend', null, wfMsg( 'allpages' ) );
$out .= Xml::openElement( 'table', array( 'id' => 'nsselect', 'class' => 'allpages' ) );
$out .= "<tr>
<td class='mw-label'>" .
Xml::label( wfMsg( 'allpagesfrom' ), 'nsfrom' ) .
" </td>
<td class='mw-input'>" .
Xml::input( 'from', 30, str_replace('_',' ',$from), array( 'id' => 'nsfrom' ) ) .
" </td>
</tr>
<tr>
<td class='mw-label'>" .
Xml::label( wfMsg( 'allpagesto' ), 'nsto' ) .
" </td>
<td class='mw-input'>" .
Xml::input( 'to', 30, str_replace('_',' ',$to), array( 'id' => 'nsto' ) ) .
" </td>
</tr>
<tr>
<td class='mw-label'>" .
Xml::label( wfMsg( 'namespace' ), 'namespace' ) .
" </td>
<td class='mw-input'>" .
Xml::namespaceSelector( $namespace, null ) . ' ' .
Xml::submitButton( wfMsg( 'allpagessubmit' ) ) .
" </td>
</tr>";
$out .= Xml::closeElement( 'table' );
$out .= Xml::closeElement( 'fieldset' );
$out .= Xml::closeElement( 'form' );
$out .= Xml::closeElement( 'div' );
return $out;
}
/**
* @param $namespace Integer (default NS_MAIN)
* @param $from String: list all pages from this name
* @param $to String: list all pages to this name
*/
function showToplevel( $namespace = NS_MAIN, $from = '', $to = '' ) {
global $wgOut;
# TODO: Either make this *much* faster or cache the title index points
# in the querycache table.
$dbr = wfGetDB( DB_SLAVE );
$out = "";
$where = array( 'page_namespace' => $namespace );
$from = Title::makeTitleSafe( $namespace, $from );
$to = Title::makeTitleSafe( $namespace, $to );
$from = ( $from && $from->isLocal() ) ? $from->getDBkey() : null;
$to = ( $to && $to->isLocal() ) ? $to->getDBkey() : null;
if( isset($from) )
$where[] = 'page_title >= '.$dbr->addQuotes( $from );
if( isset($to) )
$where[] = 'page_title <= '.$dbr->addQuotes( $to );
global $wgMemc;
$key = wfMemcKey( 'allpages', 'ns', $namespace, $from, $to );
$lines = $wgMemc->get( $key );
$count = $dbr->estimateRowCount( 'page', '*', $where, __METHOD__ );
$maxPerSubpage = intval($count/$this->maxLineCount);
$maxPerSubpage = max($maxPerSubpage,$this->maxPerPage);
if( !is_array( $lines ) ) {
$options = array( 'LIMIT' => 1 );
$options['ORDER BY'] = 'page_title ASC';
$firstTitle = $dbr->selectField( 'page', 'page_title', $where, __METHOD__, $options );
$lastTitle = $firstTitle;
# This array is going to hold the page_titles in order.
$lines = array( $firstTitle );
# If we are going to show n rows, we need n+1 queries to find the relevant titles.
$done = false;
while( !$done ) {
// Fetch the last title of this chunk and the first of the next
$chunk = ( $lastTitle === false )
? array()
: array( 'page_title >= ' . $dbr->addQuotes( $lastTitle ) );
$res = $dbr->select( 'page', /* FROM */
'page_title', /* WHAT */
array_merge($where,$chunk),
__METHOD__,
array ('LIMIT' => 2, 'OFFSET' => $maxPerSubpage - 1, 'ORDER BY' => 'page_title ASC')
);
$s = $dbr->fetchObject( $res );
if( $s ) {
array_push( $lines, $s->page_title );
} else {
// Final chunk, but ended prematurely. Go back and find the end.
$endTitle = $dbr->selectField( 'page', 'MAX(page_title)',
array_merge($where,$chunk),
__METHOD__ );
array_push( $lines, $endTitle );
$done = true;
}
$s = $res->fetchObject();
if( $s ) {
array_push( $lines, $s->page_title );
$lastTitle = $s->page_title;
} else {
// This was a final chunk and ended exactly at the limit.
// Rare but convenient!
$done = true;
}
$res->free();
}
$wgMemc->add( $key, $lines, 3600 );
}
// If there are only two or less sections, don't even display them.
// Instead, display the first section directly.
if( count( $lines ) <= 2 ) {
if( !empty($lines) ) {
$this->showChunk( $namespace, $from, $to );
} else {
$wgOut->addHTML( $this->namespaceForm( $namespace, $from, $to ) );
}
return;
}
# At this point, $lines should contain an even number of elements.
$out .= Xml::openElement( 'table', array( 'class' => 'allpageslist' ) );
while( count ( $lines ) > 0 ) {
$inpoint = array_shift( $lines );
$outpoint = array_shift( $lines );
$out .= $this->showline( $inpoint, $outpoint, $namespace );
}
$out .= Xml::closeElement( 'table' );
$nsForm = $this->namespaceForm( $namespace, $from, $to );
# Is there more?
if( $this->including() ) {
$out2 = '';
} else {
if( isset($from) || isset($to) ) {
global $wgUser;
$out2 = Xml::openElement( 'table', array( 'class' => 'mw-allpages-table-form' ) ).
'<tr>
<td>' .
$nsForm .
'</td>
<td class="mw-allpages-nav">' .
$wgUser->getSkin()->link( $this->getTitle(), wfMsgHtml ( 'allpages' ),
array(), array(), 'known' ) .
"</td>
</tr>" .
Xml::closeElement( 'table' );
} else {
$out2 = $nsForm;
}
}
$wgOut->addHTML( $out2 . $out );
}
/**
* Show a line of "ABC to DEF" ranges of articles
*
* @param $inpoint String: lower limit of pagenames
* @param $outpoint String: upper limit of pagenames
* @param $namespace Integer (Default NS_MAIN)
*/
function showline( $inpoint, $outpoint, $namespace = NS_MAIN ) {
global $wgContLang;
$inpointf = htmlspecialchars( str_replace( '_', ' ', $inpoint ) );
$outpointf = htmlspecialchars( str_replace( '_', ' ', $outpoint ) );
// Don't let the length runaway
$inpointf = $wgContLang->truncate( $inpointf, $this->maxPageLength );
$outpointf = $wgContLang->truncate( $outpointf, $this->maxPageLength );
$queryparams = $namespace ? "namespace=$namespace&" : '';
$special = $this->getTitle();
$link = $special->escapeLocalUrl( $queryparams . 'from=' . urlencode($inpoint) . '&to=' . urlencode($outpoint) );
$out = wfMsgHtml( 'alphaindexline',
"<a href=\"$link\">$inpointf</a></td><td>",
"</td><td><a href=\"$link\">$outpointf</a>"
);
return '<tr><td class="mw-allpages-alphaindexline">' . $out . '</td></tr>';
}
/**
* @param $namespace Integer (Default NS_MAIN)
* @param $from String: list all pages from this name (default FALSE)
* @param $to String: list all pages to this name (default FALSE)
*/
function showChunk( $namespace = NS_MAIN, $from = false, $to = false ) {
global $wgOut, $wgUser, $wgContLang, $wgLang;
$sk = $wgUser->getSkin();
$fromList = $this->getNamespaceKeyAndText($namespace, $from);
$toList = $this->getNamespaceKeyAndText( $namespace, $to );
$namespaces = $wgContLang->getNamespaces();
$n = 0;
if ( !$fromList || !$toList ) {
$out = wfMsgWikiHtml( 'allpagesbadtitle' );
} elseif ( !in_array( $namespace, array_keys( $namespaces ) ) ) {
// Show errormessage and reset to NS_MAIN
$out = wfMsgExt( 'allpages-bad-ns', array( 'parseinline' ), $namespace );
$namespace = NS_MAIN;
} else {
list( $namespace, $fromKey, $from ) = $fromList;
list( , $toKey, $to ) = $toList;
$dbr = wfGetDB( DB_SLAVE );
$conds = array(
'page_namespace' => $namespace,
'page_title >= ' . $dbr->addQuotes( $fromKey )
);
if( $toKey !== "" ) {
$conds[] = 'page_title <= ' . $dbr->addQuotes( $toKey );
}
$res = $dbr->select( 'page',
array( 'page_namespace', 'page_title', 'page_is_redirect' ),
$conds,
__METHOD__,
array(
'ORDER BY' => 'page_title',
'LIMIT' => $this->maxPerPage + 1,
'USE INDEX' => 'name_title',
)
);
if( $res->numRows() > 0 ) {
$out = Xml::openElement( 'table', array( 'class' => 'mw-allpages-table-chunk' ) );
while( ( $n < $this->maxPerPage ) && ( $s = $res->fetchObject() ) ) {
$t = Title::makeTitle( $s->page_namespace, $s->page_title );
if( $t ) {
$link = ( $s->page_is_redirect ? '<div class="allpagesredirect">' : '' ) .
$sk->linkKnown( $t, htmlspecialchars( $t->getText() ) ) .
($s->page_is_redirect ? '</div>' : '' );
} else {
$link = '[[' . htmlspecialchars( $s->page_title ) . ']]';
}
if( $n % 3 == 0 ) {
$out .= '<tr>';
}
$out .= "<td style=\"width:33%\">$link</td>";
$n++;
if( $n % 3 == 0 ) {
$out .= "</tr>\n";
}
}
if( ($n % 3) != 0 ) {
$out .= "</tr>\n";
}
$out .= Xml::closeElement( 'table' );
} else {
$out = '';
}
}
if ( $this->including() ) {
$out2 = '';
} else {
if( $from == '' ) {
// First chunk; no previous link.
$prevTitle = null;
} else {
# Get the last title from previous chunk
$dbr = wfGetDB( DB_SLAVE );
$res_prev = $dbr->select(
'page',
'page_title',
array( 'page_namespace' => $namespace, 'page_title < '.$dbr->addQuotes($from) ),
__METHOD__,
array( 'ORDER BY' => 'page_title DESC',
'LIMIT' => $this->maxPerPage, 'OFFSET' => ($this->maxPerPage - 1 )
)
);
# Get first title of previous complete chunk
if( $dbr->numrows( $res_prev ) >= $this->maxPerPage ) {
$pt = $dbr->fetchObject( $res_prev );
$prevTitle = Title::makeTitle( $namespace, $pt->page_title );
} else {
# The previous chunk is not complete, need to link to the very first title
# available in the database
$options = array( 'LIMIT' => 1 );
if ( ! $dbr->implicitOrderby() ) {
$options['ORDER BY'] = 'page_title';
}
$reallyFirstPage_title = $dbr->selectField( 'page', 'page_title',
array( 'page_namespace' => $namespace ), __METHOD__, $options );
# Show the previous link if it s not the current requested chunk
if( $from != $reallyFirstPage_title ) {
$prevTitle = Title::makeTitle( $namespace, $reallyFirstPage_title );
} else {
$prevTitle = null;
}
}
}
$self = $this->getTitle();
$nsForm = $this->namespaceForm( $namespace, $from, $to );
$out2 = Xml::openElement( 'table', array( 'class' => 'mw-allpages-table-form' ) ).
'<tr>
<td>' .
$nsForm .
'</td>
<td class="mw-allpages-nav">' .
$sk->link( $self, wfMsgHtml ( 'allpages' ), array(), array(), 'known' );
# Do we put a previous link ?
if( isset( $prevTitle ) && $pt = $prevTitle->getText() ) {
$query = array( 'from' => $prevTitle->getText() );
if( $namespace )
$query['namespace'] = $namespace;
$prevLink = $sk->linkKnown(
$self,
htmlspecialchars( wfMsg( 'prevpage', $pt ) ),
array(),
$query
);
$out2 = $wgLang->pipeList( array( $out2, $prevLink ) );
}
if( $n == $this->maxPerPage && $s = $res->fetchObject() ) {
# $s is the first link of the next chunk
$t = Title::MakeTitle($namespace, $s->page_title);
$query = array( 'from' => $t->getText() );
if( $namespace )
$query['namespace'] = $namespace;
$nextLink = $sk->linkKnown(
$self,
htmlspecialchars( wfMsg( 'nextpage', $t->getText() ) ),
array(),
$query
);
$out2 = $wgLang->pipeList( array( $out2, $nextLink ) );
}
$out2 .= "</td></tr></table>";
}
$wgOut->addHTML( $out2 . $out );
if( isset($prevLink) or isset($nextLink) ) {
$wgOut->addHTML( '<hr /><p class="mw-allpages-nav">' );
if( isset( $prevLink ) ) {
$wgOut->addHTML( $prevLink );
}
if( isset( $prevLink ) && isset( $nextLink ) ) {
$wgOut->addHTML( wfMsgExt( 'pipe-separator' , 'escapenoentities' ) );
}
if( isset( $nextLink ) ) {
$wgOut->addHTML( $nextLink );
}
$wgOut->addHTML( '</p>' );
}
}
/**
* @param $ns Integer: the namespace of the article
* @param $text String: the name of the article
* @return array( int namespace, string dbkey, string pagename ) or NULL on error
* @static (sort of)
* @access private
*/
function getNamespaceKeyAndText($ns, $text) {
if ( $text == '' )
return array( $ns, '', '' ); # shortcut for common case
$t = Title::makeTitleSafe($ns, $text);
if ( $t && $t->isLocal() ) {
return array( $t->getNamespace(), $t->getDBkey(), $t->getText() );
} else if ( $t ) {
return null;
}
# try again, in case the problem was an empty pagename
$text = preg_replace('/(#|$)/', 'X$1', $text);
$t = Title::makeTitleSafe($ns, $text);
if ( $t && $t->isLocal() ) {
return array( $t->getNamespace(), '', '' );
} else {
return null;
}
}
}
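Answer 0 (score: 2)
That isn't a good approach, because there is no way to stop when you reach the end of the list. You only want to split the items if their number exceeds the maximum (although you may want to allow some flexibility there, because you could reach a stage where a page has just two items on it).
I'm assuming the data set actually comes from a database, but I'm using an $items array for ease of display.
In the simplest case, assuming the request comes from a web page that sends the start and end index numbers, and that you have checked that these numbers are valid and sanitized: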
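$itemsPerPage = 50; // constant: maximum number of links on one page
// $start is inclusive, $end is exclusive; round up so the step is a whole number
// and the number of range links never exceeds $itemsPerPage
$itemStep = (int) ceil(($end - $start) / $itemsPerPage);
if($itemStep <= 1)
{
    for($i = $start; $i < $end; $i++)
    {
        // display these as individual items
        display_link($items[$i]);
    }
}
else
{
    for($i = $start; $i < $end; $i += $itemStep)
    {
        $to = $i + ($itemStep - 1); // last index covered by this range
        if($to > $end - 1)
            $to = $end - 1; // clamp to the last valid index
        display_to_from($items[$i], $items[$to]);
    }
}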
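The display functions output the required links. One problem with this, though, is that you may want to adjust the items per page, because you risk awkward splits when the count is just over the limit: with a set of (say) 51 items you could end up with one link covering 1 to 49 and another covering just 50 to 51.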
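One way to avoid a lopsided split like that is to decide first how many groups are needed and then spread the items evenly over them. The following is only a minimal sketch under that assumption, not part of the answer's code: the function name `balancedRanges` is hypothetical, and it treats `$start`/`$end` as a half-open index range.

```php
<?php
/**
 * Split the half-open index range [$start, $end) into evenly sized groups,
 * none larger than $maxPerGroup. Returns a list of [first, last] index
 * pairs, both inclusive. (Hypothetical helper for illustration.)
 */
function balancedRanges(int $start, int $end, int $maxPerGroup): array
{
    $count = $end - $start;
    if ($count <= 0 || $maxPerGroup < 1) {
        return [];
    }
    $groups = (int) ceil($count / $maxPerGroup); // fewest groups that fit
    $base   = intdiv($count, $groups);           // minimum size of each group
    $extra  = $count % $groups;                  // first $extra groups get one more
    $ranges = [];
    $from   = $start;
    for ($g = 0; $g < $groups; $g++) {
        $size     = $base + ($g < $extra ? 1 : 0);
        $ranges[] = [$from, $from + $size - 1];
        $from    += $size;
    }
    return $ranges;
}

// 51 items, at most 50 per group: two groups of 26 and 25 items.
print_r(balancedRanges(0, 51, 50));
```

The trade-off is that group sizes vary by at most one item, so no page ends up with only one or two entries unless the whole set is that small.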
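I don't see why you would assign the items to groups in your pseudocode. You drill down from page to page anyway, so you only need the start and end of each section until you reach a page where all the links fit.
- Edit
The original was wrong. Now you divide the number of items you have to go through by the maximum number of items you want to display. If it's 1,000, this lists every 20th item; if it's 100,000, every 2,000th item. If it is less than the amount you display, you can show them all individually.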
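The arithmetic above can be checked with a small sketch. The function names `rangeStep` and `levelsNeeded` are hypothetical and only illustrate the calculation; 50 links per page is the value used in the answer's code.

```php
<?php
/** Number of items covered by each range link (hypothetical helper). */
function rangeStep(int $itemCount, int $linksPerPage): int
{
    return max(1, (int) ceil($itemCount / $linksPerPage));
}

/**
 * Number of pages a visitor passes through, counting the final page that
 * lists individual items (hypothetical helper). Uses integer arithmetic
 * to avoid rounding problems with logarithms.
 */
function levelsNeeded(int $itemCount, int $linksPerPage): int
{
    if ($linksPerPage < 2) {
        throw new InvalidArgumentException('Need at least 2 links per page.');
    }
    $levels   = 1;
    $capacity = $linksPerPage; // items reachable with $levels pages
    while ($capacity < $itemCount) {
        $capacity *= $linksPerPage;
        $levels++;
    }
    return $levels;
}

foreach ([30, 1000, 100000] as $n) {
    printf("%d items: step %d, %d level(s)\n", $n, rangeStep($n, 50), levelsNeeded($n, 50));
}
```

Because each level divides the remaining range by the number of links per page, the depth grows only logarithmically with the number of articles.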
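- Edit again - adding more information about the database
No, you're right, you don't want to load 2,000,000 data records, and you don't need to. You have two options. You can make a prepared statement such as "select * from article where article = ?" and loop through the boundary values, fetching one row at a time. Or, if you want to do it in one block, assuming a MySQL database and the code above: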
$numberArray = "";
for($i = $start; $i < $end; $i += $itemStep)
{
    $to = $i + ($itemStep - 1); // last index covered by this range
    if($to > $end - 1)
        $to = $end - 1; // clamp to the last valid index
    // display_to_from($items[$i], $items[$to]);
    if($i != $start)
        $numberArray .= ", "; // string concatenation is ".=", not "+="
    $numberArray .= $i.", ".$to;
}
$sqlQuery = "SELECT * FROM articles WHERE article_id IN (".$numberArray.")";
// ... run the MySQL select and go through the results, using alternate rows as the start and end
This gives you a query such as 'SELECT * FROM articles WHERE article_id IN (1, 49, 50, 99, 100, 149 ... etc)',
which you can then process as a normal result set.
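As a variation on the single-query option, the boundary IDs can be passed as bound parameters instead of being concatenated into the SQL string. This is a sketch only: the helper name `boundaryQuery` is hypothetical, the table `articles` and column `article_id` are taken from the query above, and it assumes, as the answer does, that the IDs are contiguous and follow the display order.

```php
<?php
/**
 * Build the boundary query with "?" placeholders.
 * Returns [sql, ids]; $start is inclusive, $end is exclusive.
 * (Hypothetical helper for illustration.)
 */
function boundaryQuery(int $start, int $end, int $step): array
{
    if ($step < 1 || $end <= $start) {
        throw new InvalidArgumentException('Empty range or invalid step.');
    }
    $ids = [];
    for ($i = $start; $i < $end; $i += $step) {
        $ids[] = $i;                             // first ID of the range
        $ids[] = min($i + $step - 1, $end - 1);  // last ID of the range
    }
    $placeholders = implode(', ', array_fill(0, count($ids), '?'));
    $sql = "SELECT * FROM articles WHERE article_id IN ($placeholders) ORDER BY article_id";
    return [$sql, $ids];
}

[$sql, $ids] = boundaryQuery(1, 151, 50);
echo $sql, "\n";
print_r($ids);

// With an existing PDO connection in $pdo (assumed, not created here):
// $stmt = $pdo->prepare($sql);
// $stmt->execute($ids);
```

Note that if the last range contains a single item, its ID appears twice in the list but the database returns only one row, so pairing alternate rows as start and end needs a check for that case.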
Answer 1 (score: 0)
My approach in pseudocode:
$items = array('air', 'automatic', 'ball', ..., 'yield', 'zero', 'zoo');
$itemCount = count($items);
$itemsPerPage = 50; // constant
$counter = 0;
foreach ($items as $item) {
$groupNumber = floor($counter/$itemsPerPage);
// assign $item to group $groupNumber
$counter++;
}
// repeat this procedure recursively for each of the new groups
Do you think this is a good approach? Can you improve or complete it?
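For comparison, the recursive idea could be made runnable roughly as follows. This is a sketch under assumptions, not a definitive implementation: the function name `buildIndex` and the array keys `titles`, `ranges`, `from`, `to` and `index` are made up for illustration, and the input is assumed to be already sorted.

```php
<?php
/**
 * Recursively split a sorted list of titles into ranges so that no level
 * shows more than $maxLinks entries. (Hypothetical helper.)
 * Leaf nodes look like ['titles' => [...]];
 * inner nodes look like ['ranges' => [['from' => ..., 'to' => ..., 'index' => node], ...]].
 */
function buildIndex(array $titles, int $maxLinks): array
{
    if ($maxLinks < 2) {
        throw new InvalidArgumentException('Need at least 2 links per page.');
    }
    $count = count($titles);
    if ($count <= $maxLinks) {
        return ['titles' => array_values($titles)]; // small enough to list directly
    }
    // Chunk size chosen so that at most $maxLinks chunks are produced.
    $size   = (int) ceil($count / $maxLinks);
    $ranges = [];
    foreach (array_chunk($titles, $size) as $chunk) {
        $ranges[] = [
            'from'  => $chunk[0],
            'to'    => $chunk[count($chunk) - 1],
            'index' => buildIndex($chunk, $maxLinks),
        ];
    }
    return ['ranges' => $ranges];
}

$titles = ['zoo', 'air', 'zero', 'ball', 'automatic', 'yield', 'AAA rating'];
sort($titles, SORT_STRING | SORT_FLAG_CASE); // case-insensitive ordering
print_r(buildIndex($titles, 3));
```

Building the whole tree at once is only practical for small lists. For a large site you would compute just the level the visitor requested, which needs only the first and last title of each range.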