How do I exclude content from a rich snippet element?

Asked: 2013-10-17 05:14:53

Tags: microdata schema.org rich-snippets

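I am trying to apply rich snippet data to my web page following the http://schema.org/Article standard. One of the properties is articleBody, which I expect should contain the entire text that makes up the article.

Unfortunately, the HTML representation of the article is interspersed with buttons, ads, and other callouts whose text should not end up in articleBody.

For example: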

<div itemscope itemtype="http://schema.org/Article">
  <div itemprop="articleBody">
    <p>1st Paragraph</p>
    <p>2nd paragraph</p>
    <a>A few useful links for my users</a>
    <p>3rd paragraph</p>
    <div>A few text ads</div>
    <p>4th paragraph</p>
  </div>
</div>

Is there a way to exclude the text of the ads and links from the article body itself?

1 answer:

Answer 0 (score: 1)

No, Microdata does not provide a way to exclude content.

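The value of articleBody will be the textContent of the element, which includes the text of all its descendants, links and ads included.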
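To make the consequence concrete, here is a minimal Python sketch that approximates what a consumer would read from the question's markup. It is not a real Microdata parser: `ItempropText` and `VOID_TAGS` are hypothetical names, the class only approximates textContent, and it ignores nested items and multi-token `itemprop` values.

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not change nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class ItempropText(HTMLParser):
    """Collects the text inside every element carrying a given itemprop.

    Simplified stand-in for a Microdata consumer (an assumption made for
    illustration, not a spec-complete implementation).
    """

    def __init__(self, prop):
        super().__init__()
        self.prop = prop
        self.depth = 0     # nesting depth inside the current matching element
        self.values = []   # one string per matching element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1
        elif dict(attrs).get("itemprop") == self.prop:
            self.depth = 1
            self.values.append("")

    def handle_endtag(self, tag):
        if self.depth and tag not in VOID_TAGS:
            self.depth -= 1

    def handle_data(self, data):
        # Every text node below the matching element is kept, as with textContent.
        if self.depth:
            self.values[-1] += data


MARKUP = """
<div itemscope itemtype="http://schema.org/Article">
  <div itemprop="articleBody">
    <p>1st Paragraph</p>
    <p>2nd paragraph</p>
    <a>A few useful links for my users</a>
    <p>3rd paragraph</p>
    <div>A few text ads</div>
    <p>4th paragraph</p>
  </div>
</div>
"""

parser = ItempropText("articleBody")
parser.feed(MARKUP)
parser.close()

# Collapse whitespace only to make the result easier to read.
body = " ".join(parser.values[0].split())
print(body)
```

The printed value contains the link text and the ad text alongside the four paragraphs, which is exactly the problem described in the question.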


An ugly "hack" would be to specify multiple articleBody properties for this item:

<div itemscope itemtype="http://schema.org/Article">
  <div itemprop="articleBody">
    <p>1st Paragraph</p>
    <p>2nd paragraph</p>
  </div>
  <a>A few useful links for my users</a>
  <p itemprop="articleBody">3rd paragraph</p>
  <div>A few text ads</div>
  <p itemprop="articleBody">4th paragraph</p>
</div>

But note that Microdata does not define how those values should be interpreted, so it is up to the consumer whether they are combined into one article body.
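As an illustration of why this depends on the consumer, the sketch below feeds the same three values to two made-up consumers. The value list is what a parser would plausibly report for the markup above after whitespace normalization (an assumption), and both functions are hypothetical, not the behavior of any particular search engine or library.

```python
# Assumed parser output for the three articleBody properties, in document order.
article_body_values = [
    "1st Paragraph 2nd paragraph",
    "3rd paragraph",
    "4th paragraph",
]


def join_in_document_order(values):
    """Hypothetical consumer A: treats the values as parts of one body."""
    return " ".join(values)


def first_value_only(values):
    """Hypothetical consumer B: expects a single value and ignores the rest."""
    return values[0] if values else ""


print(join_in_document_order(article_body_values))
print(first_value_only(article_body_values))
```

Consumer A ends up with the full article text, while consumer B sees only the first two paragraphs. Both readings are permitted, so the hack cannot be relied on.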


Another ugly approach:

Duplicate the information in a meta element:

<div itemscope itemtype="http://schema.org/Article">
  <div>
    <p>1st Paragraph</p>
    <p>2nd paragraph</p>
    <a>A few useful links for my users</a>
    <p>3rd paragraph</p>
    <div>A few text ads</div>
    <p>4th paragraph</p>
  </div>
  <meta itemprop="articleBody" content="1st Paragraph. 2nd paragraph. 3rd paragraph. 4th paragraph." />
</div>
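Since a hand-copied duplicate can drift out of sync with the visible text, one option is to generate the meta element from the same paragraph list that renders the article. The following is a minimal Python sketch under that assumption. `article_body_meta` is a hypothetical helper, and joining with ". " simply mirrors the hand-written example above.

```python
from html import escape


def article_body_meta(paragraphs):
    """Return a <meta itemprop="articleBody"> tag built from paragraph texts."""
    # Join with ". " and end with a period, matching the hand-written example.
    content = ". ".join(p.strip() for p in paragraphs) + "."
    # Escape &, <, > and quotes so the text is safe inside the attribute value.
    return f'<meta itemprop="articleBody" content="{escape(content, quote=True)}" />'


paragraphs = ["1st Paragraph", "2nd paragraph", "3rd paragraph", "4th paragraph"]
meta_tag = article_body_meta(paragraphs)
print(meta_tag)
```

Only the real paragraphs go into the list, so the links and ads never reach the articleBody value.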