Question

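I am following the example provided in the Splash source here: https://github.com/scrapinghub/splash/blob/master/splash/examples/render-multiple.lua

That Lua script returns a Lua table, which is not a JSON array.

When using scrapy-splash, how can a Lua script return an array/list rather than a table/dictionary, and how do I retrieve it?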

Answer 1

If you are using scrapy-splash, the decoded result is available as response.data (see https://github.com/scrapy-plugins/scrapy-splash#responses). To access the PNG data for google.com, do something like this:

import base64
# ...
    def parse_result(self, response):
        # response.data is the decoded result; the PNG arrives base64-encoded
        img = base64.b64decode(response.data["www.google.com"])
        # ...

The linked script returns a {"<url>": "<base64 png data>"} mapping, not an array.
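As a rough illustration of that mapping shape, the sketch below decodes every entry of such a dictionary instead of a single key. The helper name decode_png_mapping and the sample data are hypothetical; in a spider the dictionary would be response.data.

```python
import base64


def decode_png_mapping(data):
    """Decode a {"<url>": "<base64 png data>"} mapping into {url: bytes}."""
    return {url: base64.b64decode(encoded) for url, encoded in data.items()}


# Stand-in for response.data; the payload is made-up bytes, not a real screenshot.
sample = {"www.google.com": base64.b64encode(b"\x89PNG fake").decode("ascii")}
images = decode_png_mapping(sample)
```

With this shape, each image is looked up by the URL string used as the key in the Lua script.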

If you want to return an array, modify the script to use integer keys and treat.as_array:

local treat = require('treat')

function main(splash, args)
  -- splash methods are called with a colon so that self is passed
  splash:set_viewport_size(800, 600)
  splash:set_user_agent('Splash bot')
  local example_urls = {"www.google.com", "www.bbc.co.uk", "scrapinghub.com"}
  local urls = args.urls or example_urls
  local results = {}
  for i, url in ipairs(urls) do
    local ok, reason = splash:go("http://" .. url)
    if ok then
      splash:wait(0.2)
      -- integer keys instead of URL keys
      results[i] = splash:png()
    end
  end
  -- mark the table as an array so it is returned as a list
  return treat.as_array(results)
end

You can then access the data like this:

import base64
# ...
    def parse_result(self, response):
        img = base64.b64decode(response.data[0])
        # ...
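
To handle every screenshot rather than only the first, the list can be walked by position. This is a minimal sketch, assuming each element of the decoded list is a base64 string in the same order as the URLs given to the script; decode_png_list and the sample data are hypothetical names used only for illustration.

```python
import base64


def decode_png_list(data):
    """Decode a list of base64 strings into a list of bytes, preserving order."""
    return [base64.b64decode(encoded) for encoded in data]


# Stand-in for response.data when the script returns treat.as_array(results).
sample = [
    base64.b64encode(b"first").decode("ascii"),
    base64.b64encode(b"second").decode("ascii"),
]
for index, png_bytes in enumerate(decode_png_list(sample)):
    # index is the zero-based position in the returned list
    print(index, len(png_bytes))
```

Note that Lua tables are 1-indexed while the Python list is 0-indexed, so results[1] in the script corresponds to response.data[0] in the callback.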
