How to include things like ™ in a sitemap

Time: 2019-05-24 15:26:45

Tags: php xml sitemap

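I'm trying to generate a sitemap with PHP, but I get an error because some of my product names contain "&trade;".

I know that & needs to be escaped as &amp;, but I'm not sure how to handle &trade;. This turned out to be hard to search for. I'm sure someone has asked it before, but I couldn't find anything relevant.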


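This is the error I get for any product with &trade; in its title.

XML Parsing Error: undefined entity

Below is the code that generates the output causing the error.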

// Remove Whitespace from Links
function url_safe ($data) {
    $data = preg_replace('/\s/', '-', htmlentities($data));
    return $data;       
}

//URLs for Products
$query = "SELECT product_id, product_name FROM product WHERE active = 'Y'";
$result = mysqli_query($dbc, $query) or die(mysqli_error($dbc) . '<br />Query: ' . $query);

while($row = mysqli_fetch_array($result)) {
    $data .= "\t<url>\n";
    $data .= "\t\t<loc>https://www.example.com/product.php?pid=$row[0]&amp;name=" . url_safe($row[1]) . "</loc>\n";
    $data .= "\t\t<changefreq>monthly</changefreq>\n";
    $data .= "\t\t<priority>1.0</priority>\n";
    $data .= "\t</url>\n";
    $i++;
}

2 Answers:

Answer 0: (score: 2)

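XML itself has no named entities like &trade;. It predefines only five (&amp;, &lt;, &gt;, &quot; and &apos;). Entities such as &trade; come from (X)HTML (or from other XML-based formats that define them).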
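To see the difference for yourself, here is a minimal sketch that feeds three variants of the same element to DOMDocument::loadXML(). The element name and the sample strings are made up for illustration; the point is only which variants the parser accepts.

```php
<?php
// Collect libxml errors internally instead of emitting PHP warnings.
libxml_use_internal_errors(true);

// Hypothetical sample inputs, not taken from the question's data.
$samples = [
    'named'   => '<name>Widget&trade;</name>',        // HTML entity, undefined in plain XML
    'numeric' => '<name>Widget&#8482;</name>',        // numeric character reference
    'literal' => "<name>Widget\u{2122}</name>",       // the character itself, as UTF-8
];

$results = [];
foreach ($samples as $label => $xml) {
    $document = new DOMDocument();
    // loadXML() returns false when the input is not well-formed.
    $results[$label] = $document->loadXML($xml);
    libxml_clear_errors();
}

foreach ($results as $label => $ok) {
    echo $label, ': ', $ok ? 'parsed' : 'failed', "\n";
}
```

Only the named variant should fail, which matches the "undefined entity" error in the question.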

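There are two solutions for special characters. You can declare the XML as UTF-8 and use the character directly, or you can use a numeric entity.

Here is an example using DOM: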

$document = new DOMDocument('1.0', 'UTF-8');
$document
    ->appendChild($document->createElement('foo'))
    ->textContent = '™';
echo $document->saveXML();    

$document = new DOMDocument('1.0', 'ASCII');
$document
    ->appendChild($document->createElement('foo'))
    ->textContent = '™';
echo $document->saveXML();

Output:

<?xml version="1.0" encoding="UTF-8"?> 
<foo>™</foo> 

<?xml version="1.0" encoding="ASCII"?> 
<foo>&#8482;</foo>

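As you can see, the UTF-8 encoded XML uses the character itself, while the ASCII-encoded XML writes it as a numeric entity.

Your example is a bit different, because you are putting the variable into the query string of a URL. So you have to encode the value for the URL first, and then encode the URL for the XML text node. The functions for encoding URL variables are urlencode() and rawurlencode(). I like to use sprintf() for readability. Here is an example of building the URL: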
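If the product names are stored with HTML entities already in them (an assumption; the question does not show the stored data), one option is to decode them into real characters before they go anywhere near the XML. A minimal sketch, with a made-up sample string:

```php
<?php
// Hypothetical stored value containing HTML entities.
$stored = 'Widget&trade; &amp; Co';

// Turn HTML entities back into real UTF-8 characters.
$decoded = html_entity_decode($stored, ENT_QUOTES | ENT_HTML5, 'UTF-8');

// Let DOM escape whatever XML needs escaped (here the bare "&").
$document = new DOMDocument('1.0', 'UTF-8');
$document
    ->appendChild($document->createElement('name'))
    ->appendChild($document->createTextNode($decoded));
$xml = $document->saveXML();

// Round trip: parse the output again and read the text back.
$check = new DOMDocument();
$check->loadXML($xml);
$roundTrip = $check->documentElement->textContent;

echo $xml;
```

After decoding, the value is plain text, so the XML API can apply exactly the escaping that XML requires and nothing else.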

$data = [
    [1, 'foo'],
    [2, 'foo ™'],
    [3, 'foo & bar'],
];

foreach ($data as $item) {
    $url = sprintf(
        'https://www.example.com/product.php?pid=%s&name=%s',
        urlencode($item[0]), 
        urlencode($item[1])
    );
    echo $url, "\n"; 
}

Output:

https://www.example.com/product.php?pid=1&name=foo 
https://www.example.com/product.php?pid=2&name=foo+%E2%84%A2 
https://www.example.com/product.php?pid=3&name=foo+%26+bar
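The two encoding layers can be checked separately. The sketch below compares urlencode() with rawurlencode(), builds the query string with http_build_query() as an alternative to sprintf(), and then escapes the finished URL for use in XML text with htmlspecialchars(). The variable names and sample values are only for illustration.

```php
<?php
$name = "foo \u{2122} & bar";

// Layer 1: encode the value for the URL.
$plus    = urlencode($name);     // spaces become "+"
$percent = rawurlencode($name);  // spaces become "%20"

// http_build_query() applies URL encoding to every value;
// the separator is passed explicitly so it does not depend on php.ini.
$query = http_build_query(['pid' => 3, 'name' => $name], '', '&');
$url   = 'https://www.example.com/product.php?' . $query;

// Layer 2: escape the whole URL for an XML text node.
// Only needed when building XML as a string; DOM and XMLWriter do this themselves.
$xmlSafe = htmlspecialchars($url, ENT_XML1 | ENT_QUOTES, 'UTF-8');

echo $plus, "\n", $percent, "\n", $url, "\n", $xmlSafe, "\n";
```

Either URL encoding works for a query string. The important part is the order: URL encoding first, XML escaping last.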

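You are creating the XML as text, but PHP provides XMLWriter for exactly this job. Using the API takes care of characters that have a special meaning in XML, such as the & used to separate URL parameters.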

$data = [
    [1, 'foo'],
    [2, 'foo ™'],
    [3, 'foo & bar'],
];

$writer = new XMLWriter();
$writer->openURI('php://stdout');

$writer->setIndent(1);
$writer->setIndentString("\t");
$writer->startDocument();
$writer->startElementNS(NULL, 'urlset', 'http://www.sitemaps.org/schemas/sitemap/0.9');

foreach ($data as $item) {
  $writer->startElement('url');
  $writer->writeElement(
        'loc', 
        sprintf(
            'https://www.example.com/product.php?pid=%s&name=%s',
            urlencode($item[0]), 
            urlencode($item[1])
        )
  );
  $writer->writeElement('changefreq', 'monthly');
  $writer->writeElement('priority', '1.0');
  $writer->endElement();
}

$writer->endElement();
$writer->endDocument();

Output:

<?xml version="1.0"?> 
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"> 
  <url>
    <loc>https://www.example.com/product.php?pid=1&amp;name=foo</loc>
    <changefreq>monthly</changefreq> 
    <priority>1.0</priority> 
  </url> 
  <url> 
    <loc>https://www.example.com/product.php?pid=2&amp;name=foo+%E2%84%A2</loc> 
    <changefreq>monthly</changefreq> 
    <priority>1.0</priority> 
  </url> 
  <url> 
    <loc>https://www.example.com/product.php?pid=3&amp;name=foo+%26+bar</loc> 
    <changefreq>monthly</changefreq> 
    <priority>1.0</priority> 
  </url> 
</urlset>
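If you need the sitemap as a string (for example to write it to a file or to check it), XMLWriter can also write to memory. The following sketch uses openMemory() and outputMemory() instead of php://stdout, then parses the result with DOMDocument to confirm it is well-formed and that the URLs come back unchanged. The $products array is sample data standing in for the database rows.

```php
<?php
// Sample rows standing in for the database result.
$products = [
    [2, "foo \u{2122}"],
    [3, 'foo & bar'],
];
$sitemapNs = 'http://www.sitemaps.org/schemas/sitemap/0.9';

$writer = new XMLWriter();
$writer->openMemory();
$writer->startDocument('1.0', 'UTF-8');
$writer->startElementNS(null, 'urlset', $sitemapNs);
foreach ($products as $item) {
    $writer->startElement('url');
    $writer->writeElement(
        'loc',
        sprintf(
            'https://www.example.com/product.php?pid=%s&name=%s',
            urlencode($item[0]),
            urlencode($item[1])
        )
    );
    $writer->endElement();
}
$writer->endElement();
$writer->endDocument();
$xml = $writer->outputMemory();

// Parse the generated XML again and collect the <loc> values.
$document = new DOMDocument();
$parsed = $document->loadXML($xml);
$locs = [];
foreach ($document->getElementsByTagNameNS($sitemapNs, 'loc') as $loc) {
    $locs[] = $loc->textContent;
}

print_r($locs);
```

The parsed values contain a plain & again, because the &amp; in the file is only the XML representation of that character.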

Answer 1: (score: 1)

You are looking for urlencode.

  

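This function is convenient when encoding a string to be used in a query part of a URL, as a convenient way to pass variables to the next page.

Keeping most of your original code, the result should look like this: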

// Remove Whitespace from Links
function url_safe ($data) {
    $data = preg_replace('/\s/', '-', htmlentities($data));

    // Adding url encoding
    $data = urlencode($data);

    return $data;       
}

//URLs for Products
$query = "SELECT product_id, product_name FROM product WHERE active = 'Y'";
$result = mysqli_query($dbc, $query) or die(mysqli_error($dbc) . '<br />Query: ' . $query);

while($row = mysqli_fetch_array($result)) {
    $data .= "\t<url>\n";
    $data .= "\t\t<loc>https://www.example.com/product.php?pid=$row[0]&amp;name=" . url_safe($row[1]) . "</loc>\n";
    $data .= "\t\t<changefreq>monthly</changefreq>\n";
    $data .= "\t\t<priority>1.0</priority>\n";
    $data .= "\t</url>\n";
    $i++;
}

For more information, see https://www.php.net/manual/en/function.urlencode.php
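One thing worth checking with this approach is what actually ends up in the URL. The sketch below runs the adjusted url_safe() on a made-up product name and compares it with calling urlencode() directly. It assumes PHP's default_charset is UTF-8, which is the default in current PHP versions.

```php
<?php
// url_safe() as adjusted in this answer.
function url_safe($data) {
    $data = preg_replace('/\s/', '-', htmlentities($data));
    return urlencode($data);
}

// Hypothetical product name containing the trademark sign.
$name = "foo \u{2122}";

$viaEntities = url_safe($name);   // htmlentities() first, then urlencode()
$direct      = urlencode($name);  // urlencode() only

echo $viaEntities, "\n";             // what goes into the sitemap
echo urldecode($viaEntities), "\n";  // what product.php receives
echo $direct, "\n";
```

Both results are safe inside the XML, but with htmlentities() in the chain the receiving page gets the entity text rather than the ™ character, so it would have to decode it again.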