Submitting a Google Search Appliance (GSA) content feed with PHP cURL - 400 error

Date: 2014-09-24 16:00:49

Tags: php curl google-search-appliance

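I am trying to develop a module for my web application that uses PHP cURL to push a content feed to our Google Search Appliance (GSA), sending the data to the appliance as a POST over port 19900. Based on everything I have read in the documentation for creating and submitting feeds to the GSA, this should work without issue, but the server returns the following (incredibly vague and unhelpful) error: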

  
      
    400. That's an error.

    Your client has issued a malformed or illegal request. That's all we know.

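I have been troubleshooting with the architect at our sister site who helped set up the GSA, and we cannot determine what is causing the problem. According to our IT department, all ports are open for this communication (if they were closed, we would not be receiving the error message at all), and we have verified that the sending server's IP address is listed as "allowed" on the GSA. Needless to say, we are stumped.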

Here is the code that transmits the XML feed:

<?php
$target_url = 'http://gsadomain.com:19900/xmlfeed';

$header = array('Content-Type: multipart/form-data');

$fields = array(
    'feedtype'=>'incremental',
    'datasource'=>'datasourcename',
    'data'=>'@'.realpath('gsa_feed.xml')
);

$ch = curl_init();

curl_setopt($ch, CURLOPT_USERPWD, "gsaadmin:gsaadminpassword");
curl_setopt($ch, CURLOPT_HTTPHEADER,$header);
curl_setopt($ch, CURLOPT_TIMEOUT,120);
curl_setopt($ch, CURLOPT_URL,$target_url);
curl_setopt($ch, CURLOPT_POST,1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER,1);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($fields));

$return = curl_exec($ch);

if (curl_errno($ch)) {
    $msg = curl_error($ch);
}

curl_close ($ch);

echo $return;
?>

Here is the XML we are trying to submit:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE gsafeed PUBLIC "-//Google//DTD GSA Feeds//EN" "http://this.is.the.ip/gsafeed.dtd">
<gsafeed>
    <header>
        <datasource>datasource</datasource>
        <feedtype>incremental</feedtype>
    </header>
    <group>
        <record url="http://website.com/mod/view.php?id=15903" mimetype="text/html" last-modified="Thu, 14 Aug 2014 18:53:00 GMT">
            <acl inheritance-type="and-both-permit">
                <principal scope="group" access="permit" namespace="Default" case-sensitivity-type="everything-case-insensitive">customers</principal>
                <principal scope="group" access="permit" namespace="Default" case-sensitivity-type="everything-case-insensitive">partners</principal>
                <principal scope="group" access="permit" namespace="Default" case-sensitivity-type="everything-case-insensitive">employees</principal>
            </acl>
            <metadata>
                <meta name="type" content="module" />
                <meta name="id" content="1" />
                <meta name="name" content="Module for Everyone" />
                <meta name="course_id" content="655" />
            </metadata>
            <content>
                This is the description of the Module for Everyone.
        </content>
        </record>
        <record url="http://website.com/mod/view.php?id=15904" mimetype="text/html" last-modified="Thu, 14 Aug 2014 18:53:00 GMT">
            <acl inheritance-type="and-both-permit">
                <principal scope="group" access="permit" namespace="Default" case-sensitivity-type="everything-case-insensitive">employees</principal>
                <principal scope="group" access="permit" namespace="Default" case-sensitivity-type="everything-case-insensitive">partners</principal>
            </acl>
            <metadata>
                <meta name="type" content="module" />
                <meta name="id" content="2" />
                <meta name="name" content="Module for Partners" />
                <meta name="course_id" content="655" />
            </metadata>
            <content>
                This is the description of the Module for Partners.
        </content>
        </record>
        <record url="http://website.com/mod/view.php?id=15905" mimetype="text/html" last-modified="Thu, 14 Aug 2014 18:53:00 GMT">
            <acl inheritance-type="and-both-permit">
                <principal scope="group" access="permit" namespace="Default" case-sensitivity-type="everything-case-insensitive">employees</principal>
            </acl>
            <metadata>
                <meta name="type" content="module" />
                <meta name="id" content="3" />
                <meta name="name" content="Module for Employees" />
                <meta name="course_id" content="655" />
            </metadata>
            <content>
                This is the description of the Module for Employees.
        </content>
        </record>
        <record url="http://website.com/course/view.php?id=655#section-1" mimetype="text/html" last-modified="Thu, 14 Aug 2014 18:53:00 GMT">
            <acl inheritance-type="and-both-permit">
                <principal scope="group" access="permit" namespace="Default" case-sensitivity-type="everything-case-insensitive">customers</principal>
                <principal scope="group" access="permit" namespace="Default" case-sensitivity-type="everything-case-insensitive">partners</principal>
                <principal scope="group" access="permit" namespace="Default" case-sensitivity-type="everything-case-insensitive">employees</principal>
            </acl>
            <metadata>
                <meta name="type" content="topic" />
                <meta name="id" content="1" />
                <meta name="name" content="Course Topic for Everyone" />
                <meta name="course_id" content="655" />
            </metadata>
            <content>
                This is the description of the Course Topic for All Audiences.
        </content>
        </record>
        <record url="http://website.com/course/view.php?id=655#section-2" mimetype="text/html" last-modified="Thu, 14 Aug 2014 18:53:00 GMT">
            <acl inheritance-type="and-both-permit">
                <principal scope="group" access="permit" namespace="Default" case-sensitivity-type="everything-case-insensitive">employees</principal>
                <principal scope="group" access="permit" namespace="Default" case-sensitivity-type="everything-case-insensitive">partners</principal>
            </acl>
            <metadata>
                <meta name="type" content="topic" />
                <meta name="id" content="2" />
                <meta name="name" content="Course Topic for Partners" />
                <meta name="course_id" content="655" />
            </metadata>
            <content>
                This is the description of the Course Topic for Partners.
        </content>
        </record>
        <record url="http://website.com/course/view.php?id=655#section-3" mimetype="text/html" last-modified="Thu, 14 Aug 2014 18:53:00 GMT">
            <acl inheritance-type="and-both-permit">
                <principal scope="group" access="permit" namespace="Default" case-sensitivity-type="everything-case-insensitive">employees</principal>
            </acl>
            <metadata>
                <meta name="type" content="topic" />
                <meta name="id" content="3" />
                <meta name="name" content="Course Topic for Employees" />
                <meta name="course_id" content="655" />
            </metadata>
            <content>
                This is the description of the Course Topic for Employees.
        </content>
        </record>
        <record url="http://website.com/course/view.php?id=655" mimetype="text/html" last-modified="Thu, 14 Aug 2014 18:53:00 GMT">
            <acl inheritance-type="and-both-permit">
                <principal scope="group" access="permit" namespace="Default" case-sensitivity-type="everything-case-insensitive">customers</principal>
                <principal scope="group" access="permit" namespace="Default" case-sensitivity-type="everything-case-insensitive">partners</principal>
                <principal scope="group" access="permit" namespace="Default" case-sensitivity-type="everything-case-insensitive">employees</principal>
            </acl>
            <metadata>
                <meta name="type" content="course" />
                <meta name="id" content="655" />
                <meta name="name" content="Course for Everyone" />
                <meta name="course_id" content="655" />
            </metadata>
            <content>
                This is the description of the Course for Everyone.
        </content>
        </record>
        <record url="http://website.com/course/view.php?id=656" mimetype="text/html" last-modified="Thu, 14 Aug 2014 18:53:00 GMT">
            <acl inheritance-type="and-both-permit">
                <principal scope="group" access="permit" namespace="Default" case-sensitivity-type="everything-case-insensitive">employees</principal>
                <principal scope="group" access="permit" namespace="Default" case-sensitivity-type="everything-case-insensitive">partners</principal>
            </acl>
            <metadata>
                <meta name="type" content="course" />
                <meta name="id" content="656" />
                <meta name="name" content="Course for Partners" />
                <meta name="course_id" content="656" />
            </metadata>
            <content>
                This is the description of the Course for Partners.
        </content>
        </record>
        <record url="http://website.com/course/view.php?id=657" mimetype="text/html" last-modified="Thu, 14 Aug 2014 18:53:00 GMT">
            <acl inheritance-type="and-both-permit">
                <principal scope="group" access="permit" namespace="Default" case-sensitivity-type="everything-case-insensitive">employees</principal>
            </acl>
            <metadata>
                <meta name="type" content="course" />
                <meta name="id" content="657" />
                <meta name="name" content="Course for Employees" />
                <meta name="course_id" content="657" />
            </metadata>
            <content>
                This is the description of the Course for Employees.
        </content>
        </record>
    </group>
</gsafeed>

Based on everything we have seen, this should work, but we have hit a brick wall. Does anyone have any ideas?

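As a side note, because of the way the pages we are trying to index are set up, having the appliance crawl them will not work (there are too many interactive elements, and everything I have read indicates that the GSA would not index them correctly).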

Edit 1: As Mark suggested in his reply, here is the link to the GSA Feeds Developer's Guide: http://www.google.com/support/enterprise/static/gsa/docs/admin/72/gsa_doc_set/feedsguide/feedsguide.html

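Edit 2: Success! See the answer below. The key was to let cURL handle the encoding of the $fields array, and to pass the file's contents rather than just the file path.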

2 answers:

Answer 0 (score: 1)

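Long story short, I was able to submit the feed correctly, and it all came down to how cURL was handling the data. Neither I nor the engineer I was working with to get the feed submitted had used PHP's cURL extension before, and neither of us knew how to get the GSA to accept the input fields. Thanks in part to this question by Ken, the answer by ThiefMaster, the next answer from Czechnology, and help from Mike, I came up with the following code: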

<?php
$target_url = 'http://gsadomain.com:19900/xmlfeed';

$header = array('Content-Type: multipart/form-data');

$fields = array(
    'feedtype'=>'incremental',
    'datasource'=>'datasourcename',
    'data'=>file_get_contents(realpath('gsa_feed.xml'))
);

$ch = curl_init();

curl_setopt($ch, CURLOPT_USERPWD, "gsaadmin:gsaadminpassword");
curl_setopt($ch, CURLOPT_HTTPHEADER,$header);
curl_setopt($ch, CURLOPT_TIMEOUT,120);
curl_setopt($ch, CURLOPT_URL,$target_url);
curl_setopt($ch, CURLOPT_POST,1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER,1);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_POSTFIELDS, $fields);

$return = curl_exec($ch);

if (curl_errno($ch)) {
    $msg = curl_error($ch);
}

curl_close ($ch);

echo $return;
?>

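The difficulty was with http_build_query(), which tried to do cURL's job for it. It returns a single URL-encoded string, so cURL never built a multipart body and the boundaries for the POST data were not set correctly, even though the Content-Type header claimed multipart/form-data.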
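To make the difference concrete, here is a minimal sketch of what http_build_query() does to the question's $fields array. The path is a placeholder standing in for realpath('gsa_feed.xml'), and nothing is sent over the network; the snippet only prints the string that cURL would have received.

```php
<?php
// Illustrative values only; '/path/to/gsa_feed.xml' is a placeholder path.
$fields = array(
    'feedtype'   => 'incremental',
    'datasource' => 'datasourcename',
    'data'       => '@/path/to/gsa_feed.xml',
);

// http_build_query() flattens the array into one URL-encoded string.
// The '@' prefix is encoded like any other character, so the file is never read.
$encoded = http_build_query($fields);

echo $encoded, "\n";
// → feedtype=incremental&datasource=datasourcename&data=%40%2Fpath%2Fto%2Fgsa_feed.xml
```

When CURLOPT_POSTFIELDS receives a string like this, cURL sends it as the request body unchanged. When it receives the array itself, cURL encodes the request as multipart/form-data and generates the boundary, which is why passing $fields directly fixed the request.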

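We later ran into trouble with some of the fields in the XML, but that was mostly because we had forgotten to str_replace ampersands, single quotes, and double quotes. Once those were handled, the XML parsed correctly and we had everything up and running.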
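As a sketch of that escaping step, the snippet below uses PHP's built-in htmlspecialchars() with the ENT_XML1 flag (PHP 5.4+) rather than the str_replace calls mentioned above. This is a swapped-in alternative, not the code we actually ran; the helper name xml_escape_for_feed and the sample string are hypothetical.

```php
<?php
// Hypothetical helper: escapes &, <, >, single quotes, and double quotes
// so a value can be placed safely inside an XML attribute or element.
function xml_escape_for_feed($value)
{
    // ENT_QUOTES covers both quote styles; ENT_XML1 emits &apos; for '.
    return htmlspecialchars($value, ENT_QUOTES | ENT_XML1, 'UTF-8');
}

// Made-up sample value for a <meta content="..."> attribute.
$name = 'Tom & Jerry\'s "Course"';

echo '<meta name="name" content="' . xml_escape_for_feed($name) . '" />', "\n";
// → <meta name="name" content="Tom &amp; Jerry&apos;s &quot;Course&quot;" />
```

Escaping each value as it is written into the feed, rather than post-processing the finished XML, avoids accidentally re-escaping the markup itself.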

Answer 1 (score: 0)

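I'm not familiar with the GSA API, but it doesn't look like any XML data is actually being sent. The string value you are sending for the data parameter is something like @/path/to/gsa_feed.xml. I imagine you actually need to POST the XML itself?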

Maybe something more like

$fields = array(
    'feedtype'=>'incremental',
    'datasource'=>'datasourcename',
    'data'=> file_get_contents(realpath('gsa_feed.xml'))
);