MediaWiki + Graphviz + Image Maps + Page Links

Time: 2012-08-26 03:19:38

Tags: php xpath mediawiki graphviz

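Background: Using MediaWiki 1.19.1, Graphviz 2.28.0, and Extension:GraphViz 0.9 on a WAMP stack (Server 2008, Apache 2.4.2, MySQL 5.5.27, PHP 5.4.5). Everything works well for the basic functionality of rendering a clickable image from a Graphviz diagram using the GraphViz extension in MediaWiki.

Problem: The links in the image map are not added to the MediaWiki pagelinks table. I understand why they are not added, but it becomes a problem when there is no way to follow the links through the "What links here" feature.

Desired solution: While the GraphViz extension is processing the diagram, I would like to use the generated .map file to build a list of wikilinks to add to the page, so that MediaWiki picks them up and adds them to the pagelinks table.

Details: This GraphViz extension code: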

<graphviz border='frame' format='png'>
digraph example1 {
  // define nodes
  nodeHello [
    label="I say Hello", 
    URL="Hello"
  ]
  nodeWorld [
    label="You say World!",
    URL="World"
  ]
  // link nodes
  nodeHello -> nodeWorld
}
</graphviz>

Generates this image:

[Image: hello world graphviz diagram]

And this image map code is in the corresponding .map file on the server:

<map id="example1" name="example1">
<area shape="poly" id="node1" href="Hello" title="I say Hello" alt="" coords="164,29,161,22,151,15,137,10,118,7,97,5,77,7,58,10,43,15,34,22,31,29,34,37,43,43,58,49,77,52,97,53,118,52,137,49,151,43,161,37"/>
<area shape="poly" id="node2" href="World" title="You say World!" alt="" coords="190,125,186,118,172,111,152,106,126,103,97,101,69,103,43,106,22,111,9,118,5,125,9,133,22,139,43,145,69,148,97,149,126,148,152,145,172,139,186,133"/>
</map>

From that image map file, I would like to extract the href and title to build wikilinks like this:

[[Hello|I say Hello]]
[[World|You say World!]]

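I am guessing that, since the .map file is essentially XML, I could use XPath to query it, but that is only a guess. PHP is not my strongest area, and I do not know the best way to use the XML / XPath options, or whether that is even the best way to pull the information out of the file.

Once I have a collection/array of wikilinks from the .map file, I am sure I can hack the GraphViz.php extension file to add them to the page content so that they end up in the pagelinks table.

Progress: As I was submitting the question, I had a bit of a Rubber Duck Problem Solving moment. I realized that, since the image map contains well-formed data, XPath was probably the way to go. Pulling out the data I needed turned out to be fairly trivial, especially since I found that the map file contents were still stored in a local string variable.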

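// $map already holds the contents of the .map file as a string
$xml = new SimpleXMLElement( $map );
$links = '';
foreach ( $xml->area as $item ) {
    // Build one wikilink per <area>: [[href|title]]
    $links .= "[[" . $item->attributes()->href . "|" . $item->attributes()->title . "]]";
}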
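
For comparison, here is a minimal sketch of the same extraction done with an explicit XPath query, collecting the wikilinks into an array instead of one string. The helper name mapToWikilinks is hypothetical, and the sample map is the one shown earlier with the coords shortened; it is not part of the GraphViz extension.

```php
<?php
// Sample map content, copied from the .map file above (coords shortened).
$map = '<map id="example1" name="example1">'
     . '<area shape="poly" id="node1" href="Hello" title="I say Hello" alt="" coords="164,29,161,22"/>'
     . '<area shape="poly" id="node2" href="World" title="You say World!" alt="" coords="190,125,186,118"/>'
     . '</map>';

/**
 * Hypothetical helper (not part of the extension): returns one
 * wikilink string per <area> element that has an href attribute.
 */
function mapToWikilinks( $map ) {
    $xml = new SimpleXMLElement( $map );
    $links = array();
    // The XPath query selects only <area> elements carrying an href.
    foreach ( $xml->xpath( '//area[@href]' ) as $area ) {
        $links[] = '[[' . (string)$area['href'] . '|' . (string)$area['title'] . ']]';
    }
    return $links;
}

$links = mapToWikilinks( $map );
echo implode( "\n", $links ), "\n";
```

Keeping the links in an array makes it easier to filter or reorder them later before they are written into the page.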

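Final solution: See my accepted answer below. Thanks for taking a look. I appreciate any assistance or direction you can offer.

1 answer:

Answer 0 (score: 4)

I finally worked through all of the issues and now have a fairly decent solution that renders the graph nicely, provides a list of links, and registers the links with the wiki. My solution does not support every capability of the current GraphViz extension, because the extension includes functionality that we do not need and that I do not want to support. Here are the assumptions and limitations of this solution: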

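  • No MscGen support: We only need Graphviz.
  • No imageAtrributes support: We want to control the format and presentation ourselves, and there appear to be inconsistencies in the imageAttributes implementation that would lead to further support issues.
  • No wikilink support: It would be nice to have consistent link usage across the wiki and the Graphviz extension, but the reality is that Graphviz is a completely different markup environment. The current extension "supports" wikilinks, but the implementation is a bit weak and leaves room for confusion. Example: wikilinks support an optional description for the link, but Graphviz already uses the node label as the description. So you end up ignoring the wikilink description and telling users "Yes, we support wikilinks, but don't use the description part." Since we would not really be using wikilinks correctly, we just implement regular links and try to avoid the confusion entirely.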

Here is what the output looks like: [Image: Wiki Graphviz enhancements]

Here are the changes that were made.

Comment out this line:

// We don't want to support wikilinks so don't replace them
//$timelinesrc = rewriteWikiUrls( $timelinesrc ); // if we use wiki-links we transform them to real urls

Replace this block of code:

// clean up map-name
$map  = preg_replace( '#<ma(.*)>#', ' ', $map );
$map  = str_replace( '</map>', '', $map );
if ( $renderer == 'mscgen' ) {
    $mapbefore = $map;
    $map = preg_replace( '/(\w+)\s([_:%#/\w]+)\s(\d+,\d+)\s(\d+,\d+)/',
   '<area shape="$1" href="$2" title="$2" alt="$2" coords="$3,$4" />',
    $map );
}

/* Procduce html
 */
if ( $wgGraphVizSettings->imageFormatting )
{
    $txt = imageAtrributes( $args, $storagename, $map, $outputType, $wgUploadPath ); // if we want borders/position/...
} else {
    $txt  = '<map name="' . $storagename . '">' . $map . '</map>' .
         '<img src="' . $wgUploadPath . '/graphviz/' . $storagename . '.' . $outputType . '"' .
                   ' usemap="#' . $storagename . '" />';
}

With this code:

$intHtml = '';
$extHtml = '';
$badHtml = '';

// Wrap the map/area info with top level nodes and load into xml object
$xmlObj = new SimpleXMLElement( $map );

// What does map look like before we start working with it?
wfDebugLog( 'graphviz', 'map before: ' . $map . "\n" );

// loop through each of the <area> nodes
foreach($xmlObj->area as $areaNode) {

    wfDebugLog( 'graphviz', "areaNode: " . $areaNode->asXML() . "\n" );

    // Get the data from the XML attributes
    $hrefValue = (string)$areaNode->attributes()->href;
    $textValue = (string)$areaNode->attributes()->title;

    wfDebugLog( 'graphviz', '$hrefValue before: ' . $hrefValue . "\n" );
    wfDebugLog( 'graphviz', '$textValue before: ' . $textValue . "\n" );

    // For the text fields, multiple spaces ("   ") in the Graphviz source (label)
    // turns into a regular space followed by encoded representations of
    // non-breaking spaces (" &#xA0;&#xA0;") in the .map file which then turns
    // into the following in the local variables: ("   ").
    // The following two options appear to convert/decode the characters
    // appropriately. Leaving the lines commented out for now, as we have
    // not seen a graph in the wild with multiple spaces in the label -
    // just happened to stumble on the scenario.
    // See http://www.php.net/manual/en/simplexmlelement.asxml.php
    // and http://stackoverflow.com/questions/2050723/how-can-i-preg-replace-special-character-like-pret-a-porter
    //$textValue = iconv("UTF-8", "ASCII//TRANSLIT", $textValue);
    //$textValue = html_entity_decode($textValue, ENT_NOQUOTES, 'UTF-8');

    // Now we need to deal with the whitespace characters like tabs and newlines
    // and also deal with them correctly to replace multiple occurrences.
    // Unfortunately, the \n and \t values in the variable aren't actually
    // tab or newline characters but literal characters '\' + 't' or '\' + 'n'.
    // So the normally recommended regex '/\s+/u' to replace the whitespace 
    // characters does not work.
    // See http://stackoverflow.com/questions/6579636/preg-replace-n-in-string
    $hrefValue = preg_replace("/( |\\\\n|\\\\t)+/", ' ', $hrefValue);
    $textValue = preg_replace("/( |\\\\n|\\\\t)+/", ' ', $textValue);

    // check to see if the url matches any of the
    // allowed protocols for external links
    if ( preg_match( '/^(?:' . wfUrlProtocols() . ')/', $hrefValue ) ) {
        // external link
        $parser->mOutput->addExternalLink( $hrefValue );
        $extHtml .= Linker::makeExternalLink( $hrefValue, $textValue ) . ', ';
    }
    else {
        $first = substr( $hrefValue, 0, 1 );
        if ( $first == '\\' || $first == '[' || $first == '/' ) {
            // potential UNC path, wikilink, absolute or relative path
            $hrefValue = '#InvalidLink';
            $badHtml .= Linker::makeExternalLink( $hrefValue, $textValue ) . ', ';
            $textValue = 'Invalid link. Check Graphviz source.';
        }
        else {
            $title = Title::newFromText( $hrefValue );
            if ( is_null( $title ) ) {
                // invalid link
                $hrefValue = '#InvalidLink';
                $badHtml .= Linker::makeExternalLink( $hrefValue, $textValue ) . ', ';
                $textValue = 'Invalid link. Check Graphviz source.';
            }
            else {
                // internal link
                $parser->mOutput->addLink( $title );
                $intHtml .= Linker::link( $title, $textValue ) . ', ';
                $hrefValue = $title->getFullURL();
            }
        }
    }

    $areaNode->attributes()->href = $hrefValue;
    $areaNode->attributes()->title = $textValue;

}

$map = $xmlObj->asXML();

// The contents of $map, which is now XML, gets embedded
// in the HTML sent to the browser so we need to strip
// the XML version tag and we also strip the <map> because
// it will get replaced with a new one with the correct name.
$map = str_replace( '<?xml version="1.0"?>', '', $map );
$map = preg_replace( '#<ma(.*)>#', ' ', $map );
$map = str_replace( '</map>', '', $map );

// Let's see what it looks like now that we are done with it.
wfDebugLog( 'graphviz', 'map after: ' . $map . "\n" );

$txt = '' .
    '<table style="background-color:#f9f9f9;border:1px solid #ddd;">' .
        '<tr>' .
            '<td style="border:1px solid #ddd;text-align:center;">' .
                '<map name="' . $storagename . '">' . $map . '</map>' .
                '<img src="' . $wgUploadPath . '/graphviz/' . $storagename . '.' . $outputType . '"' . ' usemap="#' . $storagename . '" />' .
            '</td>' .
        '</tr>' .
        '<tr>' .
            '<td style="font:10px verdana;">' .
                'This Graphviz diagram links to the following pages:' .
                '<br /><strong>Internal</strong>: ' . ( $intHtml != '' ?  rtrim( $intHtml, ' ,' ) : '<em>none</em>' ) .
                '<br /><strong>External</strong>: ' . ( $extHtml != '' ?  rtrim( $extHtml, ' ,' ) : '<em>none</em>' ) .
                ( $badHtml != '' ? '<br /><strong>Invalid</strong>: ' . rtrim($badHtml, ' ,') .
                '<br /><em>Tip: Do not use wikilinks ([]), UNC paths (\\) or relative links (/) when creating links in Graphviz diagrams.</em>' : '' ) .
            '</td>' .
        '</tr>' .
    '</table>';
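
The whitespace handling in the code above is easy to misread, so here is a small standalone sketch of that one step. The function name collapseGraphvizWhitespace is hypothetical; the pattern is the one used above, and the sample input is made up for illustration.

```php
<?php
/**
 * Hypothetical helper wrapping the pattern from the code above.
 */
function collapseGraphvizWhitespace( $value ) {
    // In a double-quoted PHP string, "\\\\n" becomes the regex \\n,
    // which matches a literal backslash followed by the letter n.
    return preg_replace( "/( |\\\\n|\\\\t)+/", ' ', $value );
}

// Single quotes keep \n and \t as two literal characters each,
// which mimics what the answer describes seeing in the variable.
$raw = 'You  say\n\tWorld!';

// The usual whitespace regex only collapses the double space here.
$viaWhitespaceClass = preg_replace( '/\s+/u', ' ', $raw );

$clean = collapseGraphvizWhitespace( $raw );
echo $clean, "\n";
```

The contrast with '/\s+/u' is the point of the sketch: that pattern leaves the literal backslash sequences in place, while the alternation collapses each run of spaces and escape sequences into a single space.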

Possible enhancements:

  • It would be nice if the list of links below the diagram were sorted and de-duplicated.
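
One way the sorting and de-duplication might be approached is to collect the links in arrays keyed by target instead of appending to strings. This is only a sketch under that assumption: formatLinkList is a hypothetical helper, and the plain anchor tags stand in for whatever HTML the Linker calls would return in the real loop.

```php
<?php
/**
 * Hypothetical helper: sorts links by target (ignoring case) and joins
 * them for display. Duplicates are already gone because the array is
 * keyed by target.
 *
 * @param array $links Map of link target => rendered HTML for that link.
 * @return string
 */
function formatLinkList( array $links ) {
    if ( count( $links ) === 0 ) {
        return '<em>none</em>';
    }
    uksort( $links, 'strcasecmp' );
    return implode( ', ', $links );
}

// Keying by target de-duplicates as links are collected.
$intLinks = array();
foreach ( array( 'World', 'Hello', 'World' ) as $target ) {
    // Stand-in HTML; the real code would store the Linker output here.
    $intLinks[$target] = '<a href="/wiki/' . $target . '">' . $target . '</a>';
}

echo formatLinkList( $intLinks ), "\n";
```

With this shape, the table row would call the helper once per category instead of using rtrim on the accumulated string.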