How do I index a feed in the Google Search Appliance?

Time: 2013-09-24 10:51:34

Tags: django google-search google-search-appliance

I have my Atom feed (a list of continents as XML) at this URL, .../continent/search?view=atom, and it looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:opensearch="http://a9.com/-/spec/opensearch/1.1/">
    <title>List of all continents</title>
    <opensearch:totalResults>{{ continents_length }}</opensearch:totalResults>
    <opensearch:startIndex>{{ continents.start_index }}</opensearch:startIndex>
    <opensearch:itemsPerPage>{{ count }}</opensearch:itemsPerPage>
    <opensearch:Query continent="request" searchTerms="" startPage="{{ continents.start_index }}" />
    <author><name>My_site</name></author>
    <id>urn:domain-id:mysite.com:continent</id>

    <link rel="self" href="{{ url }}" />
    {% for continent in continents %}
    <entry>
        <span class="continent_id">{{ continent.continent_id }}</span>
        <span class="continent_name">{{ continent.continent_name }}</span>
        <span class="list_countries">{{ continent.list_countries }}</span>
    </entry>
    {% endfor %}
</feed>

When I wanted to PUT my feed and have it indexed through the gsa-interface, I used this:

<?xml version="1.0" encoding="utf-8"?>
<entry xmlns="http://www.w3.org/2005/Atom">
    <title>continent</title>
    <id>urn:domain-id:mysite.com:continent</id>
    <author>
        <name>admin user</name>
    </author>
    <link rel="self" href=".../feed/continent"/>
    <content type="xhtml">
        <div xmlns="http://www.w3.org/1999/xhtml">
            <span id="refresh-each">15 12,14,18 * * *</span>
            <span id="gsa-datasource">continent</span>
            <span id="gsa-feedtype">full</span>
            <span id="url">...continent/search?view=atom</span>
            <span id="opensearch-pattern">&amp;count=100&amp;startPage=%STARTPAGE%</span>
            <ul class="connection">
                <li id="userid">user</li>
                <li id="password">pass</li>
            </ul>
            <ul id="metadata">
                <li id="continent_id">atom:entry/xhtml:span[@class='continent_id']</li>
                <li id="continent_name">atom:entry/xhtml:span[@class='continent_name']</li>
                <li id="list_countries">atom:entry/xhtml:span[@class='list_countries']</li>
            </ul>
            <div id="xsl-content">
                <![CDATA[
                    <xsl:stylesheet version="1.0"
                        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                        xmlns:atom="http://www.w3.org/2005/Atom"
                        xmlns:xhtml="http://www.w3.org/1999/xhtml"
                        exclude-result-prefixes="atom xhtml">

                        <xsl:template name="FormatDescription">
                            <xsl:param name="name"/>
                            <xsl:value-of select="$name"/>
                        </xsl:template>

                        <xsl:template match="atom:entry">
                            <html>
                                <body>
                                    <xsl:apply-templates select="atom:entry/xhtml:span" />
                                </body>
                            </html>
                        </xsl:template>
                        <xsl:template match="atom:entry/xhtml:span">
                            <xsl:copy-of select="*"/>
                        </xsl:template>

                    </xsl:stylesheet>
                ]]>
            </div>
        </div>
    </content>
</entry>

But when I check the traffic for the transferred files, it reports 0 files, with this error:

ProcessNode: Missing required attribute url. skipping element., skipping record

On the second and third indexing runs there is no error, but still no files!

200 OK Feed continent has been pushed successfully to the Google Search Appliance.

Any suggestions or recommendations?

2 Answers:

Answer 0: (score: 2)

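To follow up on this: the word "feed" is confusing here. A GSA content feed is not the same thing as an RSS or Atom feed. Here is a simplified example (from the documentation):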

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE gsafeed PUBLIC "-//Google//DTD GSA Feeds//EN" "">
<gsafeed>
   <header>
     <datasource>hello</datasource>
     <feedtype>incremental</feedtype>
   </header>
  <group>
    <record url="http://www.corp.enterprise.com/hello02" mimetype="text/plain">
      <content>UPDATED - This is hello02</content>
    </record>
  </group>
</gsafeed>

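As you can see, this is a very specific XML format that has nothing in common with website update feed formats. Every record carries its own url attribute, which is what the "Missing required attribute url" error is complaining about. The documentation on this is good: http://www.google.com/support/enterprise/static/gsa/docs/admin/70/gsa_doc_set/feedsguide/feedsguide.html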
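
Since the question is tagged django, here is a minimal Python sketch of producing that format from a list of items. It is only an illustration: `build_gsafeed` is a hypothetical helper name, and the `http://www.example.com/continent/...` URLs are placeholders, not anything from the question. It uses only the standard library and rejects records that lack a URL, since the GSA would skip those.

```python
import xml.etree.ElementTree as ET

# DOCTYPE line copied from the gsafeed example above.
DOCTYPE = '<!DOCTYPE gsafeed PUBLIC "-//Google//DTD GSA Feeds//EN" "">'


def build_gsafeed(datasource, feedtype, records):
    """Return a gsafeed XML document as a string.

    records is an iterable of (url, mimetype, text) tuples.
    """
    root = ET.Element("gsafeed")
    header = ET.SubElement(root, "header")
    ET.SubElement(header, "datasource").text = datasource
    ET.SubElement(header, "feedtype").text = feedtype

    group = ET.SubElement(root, "group")
    for url, mimetype, text in records:
        if not url:
            # A record without a url attribute is skipped by the appliance.
            raise ValueError("every record needs a url attribute")
        record = ET.SubElement(group, "record", url=url, mimetype=mimetype)
        ET.SubElement(record, "content").text = text

    # tostring() with encoding="unicode" omits the XML declaration,
    # so the declaration and DOCTYPE are prepended by hand.
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="utf-8"?>\n' + DOCTYPE + "\n" + body


if __name__ == "__main__":
    # Placeholder data standing in for the continents list.
    continents = [
        ("http://www.example.com/continent/1", "text/plain", "Africa"),
        ("http://www.example.com/continent/2", "text/plain", "Asia"),
    ]
    print(build_gsafeed("continent", "full", continents))
```

In a Django view the tuples would presumably come from the continent queryset, with each record's URL pointing at a page the appliance is allowed to associate with that content.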

Answer 1: (score: 1)

See the feed documentation.

You need to pass the GSA a feed XML file, not an Atom feed.
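
As a rough sketch of what "passing" the file could look like from Python: my understanding of the Feeds Protocol guide is that the feed is sent as a multipart form POST with the fields `feedtype`, `datasource` and `data` to the appliance's feed port (19900, path `/xmlfeed`). Treat the endpoint and field layout as assumptions to check against the documentation for your GSA version. `encode_feed_form` and `push_feed` are hypothetical helper names.

```python
import urllib.request
import uuid


def encode_feed_form(datasource, feedtype, feed_xml):
    """Encode the three form fields as a multipart/form-data body."""
    boundary = uuid.uuid4().hex
    fields = (("feedtype", feedtype), ("datasource", datasource), ("data", feed_xml))
    parts = []
    for name, value in fields:
        parts.append(
            "--{b}\r\n"
            'Content-Disposition: form-data; name="{n}"\r\n'
            "\r\n"
            "{v}\r\n".format(b=boundary, n=name, v=value)
        )
    parts.append("--{b}--\r\n".format(b=boundary))
    body = "".join(parts).encode("utf-8")
    content_type = "multipart/form-data; boundary=" + boundary
    return body, content_type


def push_feed(gsa_host, datasource, feedtype, feed_xml):
    """POST the feed to the appliance; endpoint is assumed, see lead-in."""
    body, content_type = encode_feed_form(datasource, feedtype, feed_xml)
    request = urllib.request.Request(
        "http://{}:19900/xmlfeed".format(gsa_host),
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status, response.read().decode("utf-8")
```

Calling `push_feed("my-gsa-host", "continent", "full", feed_xml)` would then send the document, where `my-gsa-host` is a placeholder and `feed_xml` is a string in the gsafeed format, not the Atom feed.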