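What is missing from my code to fetch the HTML source of a website (credit to @Michal Kottman)? I want the same result as right-clicking and choosing "View page source" in Chrome.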
local curl = require "luacurl"
local c = curl.new()
function GET(url)
    c:setopt(curl.OPT_URL, url)
    c:setopt(curl.OPT_PROXY, "http://myproxy.bla.com:8080")
    c:setopt(curl.OPT_HTTPHEADER, "Connection: Keep-Alive", "Accept-Language: en-us")
    c:setopt(curl.OPT_CONNECTTIMEOUT, 30)
    local t = {} -- this will collect resulting chunks
    c:setopt(curl.OPT_WRITEFUNCTION, function (param, buf)
        table.insert(t, buf) -- store a chunk of data received
        return #buf
    end)
    c:setopt(curl.OPT_PROGRESSFUNCTION, function(param, dltotal, dlnow)
        print('%', url, dltotal, dlnow) -- do your fancy reporting here
    end)
    c:setopt(curl.OPT_NOPROGRESS, false) -- use this to activate progress
    assert(c:perform())
    return table.concat(t) -- return the whole data as a string
end
--local s = GET 'http://www.lua.org/'
local s = GET 'https://www.youtube.com/watch?v=dT_fkwX4fRM'
print(s)
local file = assert(io.open("text.html", "wb")) -- fail loudly if the file cannot be opened
file:write(s)
file:close()
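Unfortunately, this has to be done in Lua with the luacurl binding to libcurl, because LuaSocket does not work when a proxy is involved (at least not for me).
The file I download is empty. From cmd I get the page source without any problem: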
curl http://mypage.com
It works for lua.org, but not for the YouTube link. What am I missing?
Answer 0 (score: 1)
local curl = require "luacurl"
local c = curl.new()
function GET(url)
    c:setopt(curl.OPT_URL, url)
    c:setopt(curl.OPT_PROXY, "http://myproxy.com:8080")
    c:setopt(curl.OPT_HTTPHEADER, "Connection: Keep-Alive", "Accept-Language: en-us")
    c:setopt(curl.OPT_CONNECTTIMEOUT, 30)
    c:setopt(curl.OPT_FOLLOWLOCATION, true) -- follow redirects; REALLY IMPORTANT, ELSE IT FAILS
    c:setopt(curl.OPT_USERAGENT, "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/42.0.2311.90 Safari/537.36")
    c:setopt(curl.OPT_SSL_VERIFYPEER, false) -- REALLY IMPORTANT, ELSE NOTHING HAPPENS -.- (note: this disables certificate verification)
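    c:setopt(curl.OPT_ENCODING, "") -- could be important; this option sets Accept-Encoding, and "" asks for every encoding libcurl supports ("utf8" is a charset, not a content encoding)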
    local t = {} -- this will collect resulting chunks
    c:setopt(curl.OPT_WRITEFUNCTION, function (param, buf)
        table.insert(t, buf) -- store a chunk of data received
        return #buf
    end)
    c:setopt(curl.OPT_PROGRESSFUNCTION, function(param, dltotal, dlnow)
        print('%', url, dltotal, dlnow) -- do your fancy reporting here
    end)
    c:setopt(curl.OPT_NOPROGRESS, false) -- use this to activate progress
    assert(c:perform())
    return table.concat(t) -- return the whole data as a string
end
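Since the original symptom was an empty text.html, it may help to check the body before writing it out. The following is only a sketch: `save_page` is a hypothetical helper name that is not part of luacurl, and it assumes the `GET` function above is already defined.

```lua
-- Hypothetical helper (not part of luacurl): write a fetched page to disk,
-- but refuse to create the file when the response body is empty.
local function save_page(path, body)
    if body == nil or #body == 0 then
        return nil, "empty response body"
    end
    local file, err = io.open(path, "wb")
    if not file then
        return nil, err -- e.g. the directory is missing or not writable
    end
    file:write(body)
    file:close()
    return true
end

-- Usage, assuming GET from the code above:
-- local html = GET("https://www.youtube.com/watch?v=dT_fkwX4fRM")
-- assert(save_page("text.html", html))
```

With this in place, an empty download shows up as an error message instead of a zero-byte file.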