Does xdmp:http-get() support proxies?

Time: 2014-11-17 02:37:39

Tags: proxy marklogic

xdmp:http-get($url,
    <options xmlns="xdmp:document-get">
        <format>binary</format>
    </options>)[2] 

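Hi all,

The query above returns no response when the request has to go through a proxy server. I know the proxy server's IP address and port number. Does anyone know where to specify them?

MarkLogic version: 7.x

Recently I tried to configure the proxy with the following code, based on the discussion at http://markmail.org/message/sbfj44jtmpsyopyh.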

let $proxy := "http://171.21.15.60:3128"
let $uri := "http://www.austlii.edu.au/cgi-bin/sinosrch.cgi?results=200;query=damage"
let $host := tokenize($uri,'/')[3] 
let $proxyuri := concat($proxy, '/', tokenize($uri, '/')[last()]) 
return xdmp:http-post(
  $proxyuri,
  <options xmlns="xdmp:http">
    <headers>
      <Host>{$host}</Host>          
    </headers>
  </options>
)

But the response I get is a 400 Bad Request.

<response xmlns="xdmp:http">
 <code>400</code>
 <message>Bad Request</message>
 <headers>
  <server>squid/3.1.4</server>
  <mime-version>1.0</mime-version>
  <date>Thu, 20 Nov 2014 04:09:50 GMT</date>
  <content-type>text/html</content-type>
  <content-length>3071</content-length>
  <x-squid-error>ERR_INVALID_URL 0</x-squid-error>
  <vary>Accept-Language</vary>
  <content-language>en</content-language>
  <x-cache>MISS from l076ddms1</x-cache>
  <x-cache-lookup>NONE from l076ddms1:3128</x-cache-lookup>
  <via>1.0 l076ddms1 (squid/3.1.4)</via>
  <proxy-connection>close</proxy-connection>
 </headers>
</response>

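The body of the error page says:

The following error was encountered while trying to retrieve the URL: /sinosrch.cgi?results=200;query=damage

Can anyone help me solve this?

Thanks.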

Hi all,

After trying the steps described by @mblakele, I still get the same response.

declare namespace http="xdmp:http";

declare function local:http-options(
  $options as element(http:options)?,
  $extra as element(http:options)?)
as element()?
{
  if (empty($extra)) then $options
  else if (empty($options)) then $extra
  else element http:options {
    (: TODO - needs to handle conflicting options. :)
    $options/*,
    $extra/* }
};

declare function local:http-get(
  $proxy as xs:string,
  $uri as xs:string,
  $options as element(http:options)?)
 as item()+
{

  let $uri-toks := tokenize($uri, '/+')
  let $uri-host := $uri-toks[2]
  let $options := local:http-options(
    $options,
    element http:options {
      element http:headers {
        element http:host { $uri-host } } })
  (: TODO support TLS proxies using https. :)
  let $uri-proxy := concat(
    'http://', $proxy,
    substring-after($uri, $uri-host))
  return xdmp:http-get($uri-proxy, $options)
};

local:http-get(
  '171.21.15.60:3128', 'http://www.austlii.edu.au/cgi-bin/sinosrch.cgi?results=200;query=damage', ())

Value of $uri-proxy in the code above:

http://171.21.15.60:3128/cgi-bin/sinosrch.cgi?results=200;query=damage

Value of $uri-host in the code above:

www.austlii.edu.au

Value of $options in the code above:

<http:options xmlns:http="xdmp:http">    
 <http:headers>    
 <http:host>www.austlii.edu.au</http:host>
</http:headers></http:options>

The error is:

The following error was encountered while trying to retrieve the URL: /cgi-bin/sinosrch.cgi?results=200;query=damage

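2 Answers:

Answer 0 (score: 1):

I don't think there is any direct support, but http://markmail.org/message/sbfj44jtmpsyopyh may help.

[Edit] Since that code has some problems, here is a quick rewrite. It is still not fully general, but it should be easier to debug and extend.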

declare namespace http="xdmp:http" ;

declare function local:http-options(
  $options as element(http:options)?,
  $extra as element(http:options)?)
as element()?
{
  if (empty($extra)) then $options
  else if (empty($options)) then $extra
  else element http:options {
    (: TODO - needs to handle conflicting options. :)
    $options/*,
    $extra/* }
};

declare function local:http-get(
  $proxy as xs:string,
  $uri as xs:string,
  $options as element(http:options)?)
 as item()+
{
  let $_ := (
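    (: Allow dots and hyphens so that dotted host names and IP addresses
       such as 171.21.15.60:3128 pass the check; \w alone rejects them. :)
    if (matches($proxy, '^[\w.\-]+(:\d+)?$')) then ()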
    else error(
      (), 'BADPROXY',
      ('Must be a string host:port', xdmp:describe($proxy))))
  let $uri-toks := tokenize($uri, '/+')
  let $uri-host := $uri-toks[2]
  let $options := local:http-options(
    $options,
    element http:options {
      element http:headers {
        element http:host { $uri-host } } })
  (: TODO support TLS proxies using https. :)
  let $uri-proxy := concat(
    'http://', $proxy,
    substring-after($uri, $uri-host))
  return xdmp:http-get($uri-proxy, $options)
};

local:http-get(
  'localhost:8118', 'http://www.google.com/', ())
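
A possible explanation for why a forward proxy such as Squid still answers with ERR_INVALID_URL: with the rewritten URL, the client sends only the path in the request line and carries the real host in the Host header. HTTP/1.1 (RFC 7230, section 5.3.2) instead has clients send the full absolute URI when talking to a proxy, and the Squid error page in the question does echo back just the path. The Python sketch below only builds the two request heads as strings so the difference is visible. The function names are hypothetical, nothing is sent over the network, and it says nothing about what xdmp:http-get emits internally.

```python
from urllib.parse import urlsplit


def origin_form_request(uri: str) -> str:
    """Request head with a path-only target, as produced by the
    Host-header workaround (hypothetical helper)."""
    parts = urlsplit(uri)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return f"GET {target} HTTP/1.1\r\nHost: {parts.netloc}\r\n\r\n"


def absolute_form_request(uri: str) -> str:
    """Request head with the full URI as the target, the form a
    forward proxy expects (hypothetical helper)."""
    parts = urlsplit(uri)
    return f"GET {uri} HTTP/1.1\r\nHost: {parts.netloc}\r\n\r\n"


uri = "http://www.austlii.edu.au/cgi-bin/sinosrch.cgi?results=200;query=damage"
print(origin_form_request(uri))
print(absolute_form_request(uri))
```

If this is indeed the cause, no choice of URL or Host header passed to xdmp:http-get would help, because the request target is derived from the URL path. A proxy that accepts path-only requests (for example one configured as a reverse proxy or interceptor) would not be affected.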

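Answer 1 (score: 0):

Since I could not get MarkLogic to work through the proxy, I developed a REST API in .NET that tunnels through the proxy to reach external websites, and I have MarkLogic call that local web service.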
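
The original service was written in .NET and its code is not shown. As a rough illustration of the same idea, here is a minimal Python sketch of such a local relay, using only the standard library. The `/fetch?url=...` convention, port 8765, and all function and class names are assumptions made for this example. The proxy address is the one from the question.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlsplit
import urllib.request

PROXY = "http://171.21.15.60:3128"  # proxy address taken from the question


def make_opener(proxy: str) -> urllib.request.OpenerDirector:
    """Build an opener that routes http and https requests via the proxy."""
    handler = urllib.request.ProxyHandler({"http": proxy, "https": proxy})
    return urllib.request.build_opener(handler)


def target_from_path(path: str):
    """Return the percent-decoded ?url=... value, or None if absent.
    The 'url' parameter name is a hypothetical convention."""
    values = parse_qs(urlsplit(path).query).get("url")
    return values[0] if values else None


class RelayHandler(BaseHTTPRequestHandler):
    opener = make_opener(PROXY)

    def do_GET(self):
        target = target_from_path(self.path)
        if target is None:
            self.send_error(400, "missing url parameter")
            return
        try:
            with self.opener.open(target, timeout=30) as upstream:
                body = upstream.read()
                ctype = upstream.headers.get(
                    "Content-Type", "application/octet-stream")
        except OSError:  # URLError and HTTPError both derive from OSError
            self.send_error(502, "upstream fetch failed")
            return
        # Pass the upstream body and content type back to the caller.
        self.send_response(200)
        self.send_header("Content-Type", ctype)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8765) -> None:
    """Run the relay on localhost only; blocks until interrupted."""
    HTTPServer(("127.0.0.1", port), RelayHandler).serve_forever()
```

After calling `serve()`, MarkLogic could request something like `xdmp:http-get(concat("http://localhost:8765/fetch?url=", xdmp:url-encode($uri)))`. Percent-encoding the target matters here, since the example URL contains `?` and `;`. The sketch handles GET only and does no access control, so it should stay bound to localhost.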

Hopefully MarkLogic's http-get() will support proxies in the future.

Thanks to dev for the valuable suggestions.