Second HTTPoison.get request returns 404

Asked: 2019-01-11 08:21:33

Tags: elixir elixir-poison

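I am testing multiple requests to multiple URLs as part of some web scraping. After the first request succeeds, the second one often fails, and I cannot work out why.

I make two simple requests to two different websites. The second request, which targets Yahoo, comes back with a Google-branded 404 page instead of Yahoo's home page. If I start the server and only hit Yahoo, the request returns as expected. The same behavior occurs if my first request goes to Wikipedia and subsequent requests go elsewhere.

Can anyone explain what is happening?

Thanks.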

Deps: {:httpoison, "~> 1.5"}

First, I start the server (as described in the docs):

iex(1)> HTTPoison.start
{:ok, []}

Next, I request the Google home page:

iex(2)> HTTPoison.get "https://www.google.com"
{:ok,
 %HTTPoison.Response{
   body: "<!doctype html><html itemscope=\"\" itemtype=\"http://schema.org/WebPage\" lang=\"en\"><head><meta content=\"Search the world's information, including webpages, images, videos and more. Google has many special features to help you find exactly what you're looking for.\" name=\"description\"><meta content=\"noodp\" name=\"robots\"><meta content=\"text/html; charset=UTF-8\" http-equiv=\"Content-Type\"><meta content=\"/logos/doodles/2019/celebrating-earl-scruggs-5680695065182208.3-law.gif\" itemprop=\"image\"><meta content=\"Celebrating Earl Scruggs\" property=\"twitter:title\"><meta content=\"Celebrating Earl Scruggs! #GoogleDoodle\" property=\"twitter:description\"><meta content=\"Celebrating Earl Scruggs! #GoogleDoodle\" property=\"og:description\"><meta content=\"summary_large_image\" property=\"twitter:card\"><meta content=\"@GoogleDoodles\" property=\"twitter:site\"><meta content=\"https://www.google.com/logos/doodles/2019/celebrating-earl-scruggs-5680695065182208-2xa.gif\" property=\"twitter:image\"><meta content=\"https://www.google.com/logos/doodles/2019/celebrating-earl-scruggs-5680695065182208-2xa.gif\" property=\"og:image\"><meta content=\"1000\" property=\"og:image:width\"><meta content=\"400\" property=\"og:image:height\"><meta content=\"https://www.google.com/logos/doodles/2019/celebrating-earl-scruggs-5680695065182208-2xa.gif\" property=\"og:url\"><meta content=\"video.other\" property=\"og:type\"><title>Google</title><script 
nonce=\"j0aPHCuRPUlftRzX2g6tTQ==\">(function(){window.google={kEI:'D1A4XKnVOo-6_wTgtpOgDA',kEXPI:'0,1353747,57,50,1907,1017,625,781,698,527,731,325,1124,349,30,1227,806,95,546,352,2335328,167,32,68,329226,1294,12383,4855,32692,2074,13173,867,10761,1402,6381,854,2481,2,2,6801,364,1165,7,2147,1262,4243,224,1017,1195,266,3742,1365,575,835,284,2,579,727,2069,363,58,2,1,3,933,364,4324,3397,302,658,610,291,482,2115,135,1407,1413,1529,395,525,621,5,2,2,1963,528,2067,182,283,2838,298,670,1044,1,468,1344,386,743,268,81,7,1,2,27,461,620,29,983,6,406,458,466,2,1379,769,536,428,267,2552,1739,313,876,412,2,554,2368,2,264,381,286,948,11,1209,38,363,557,270,303,145,155,499,285,433,42,1322,99,342,43,47,1080,543,1826,367,789,270,603,661,431,49,626,265,217,779,1531,35,2,4,2,670,44,226,1292,3,237,9,12,408,349,167,82,247,879,238,410,529,187,508,105,1,1496,5,12,620,464,87,99,25,178,283,278,6,38,53,290,390,37,117,9,81,345,103,17,112,7,203,173,81,2,83,340,14,617,604,58,351,614,175,97,1,1,2,177,803,60,264,88,5968727,2554,233,22,5997346,90,2800095,4,1572,549,332,445,1,2,80,1,900,583,4,309,1,8,1,2,2132,1,1,1,1,1,414,1,748,141,59,726,3,7,443,3,117,1,2,140,226,23,53,22306694',authuser:0,kscs:'c9c918f0_D1A4XKnVOo-6_wTgtpOgDA',kGL:'US'};google.kHL='en';})();google.time=function(){return(new Date).getTime()};(function(){google.lc=[];google.li=0;google.getEI=function(a){for(var b;a&&(!a.getAttribute||!(b=a.getAttribute(\"eid\")));)a=a.parentNode;return b||google.kEI};google.getLEI=function(a){for(var b=null;a&&(!a.getAttribute||!(b=a.getAttribute(\"leid\")));)a=a.parentNode;return b};google.https=function(){return\"https:\"==window.location.protocol};google.ml=function(){return null};google.log=function(a,b,e,c,g){if(a=google.logUrl(a,b,e,c,g)){b=new Image;var d=google.lc,f=google.li;d[f]=b;b.onerror=b.onload=b.onabort=function(){delete d[f]};google.vel&&google.vel.lu&&google.vel.lu(a);b.src=a;google.li=f+1}};google.logUrl=function(a,b,e,c,g){var 
d=\"\",f=google.ls||\"\";e||-1!=b.search(\"&ei=\")||(d=\"&ei=\"+google.getEI(c),-1==b.search(\"&lei=\")&&(c=google.getLEI(c))&&(d+=\"&lei=\"+c));c=\"\";!e&&google.cshid&&-1==b.search(\"&cshid=\")&&\"slh\"!=a&&(c=\"&cshid=\"+google.cshid);a=e||\"/\"+(g||\"gen_204\")+\"?atyp=i&ct=\"+a+\"&cad=\"+b+d+f+\"&zx=\"+google.time()+c;/^http:/i.test(a)&&google.https()&&(google.ml(Error(\"a\"),!1,{src:a,glmm:1}),a=\"\");return a};}).call(this);(function(){google.y={};google.x=function(a,b){if(a)var c=a.id;else{do c=Math.random();while(google.y[c])}google.y[c]=[a,b];return!1};google.lm=[];google.plm=function(a){google.lm.push.apply(google.lm,a)};google.lq=[];google.load=function(a,b,c){google.lq.push([[a],b,c])};google.loadAll=function(a,b){google.lq.push([a,b])};}).call(this);google.f={};</scri" <> ...,
   headers: [
     {"Date", "Fri, 11 Jan 2019 08:13:03 GMT"},
     {"Expires", "-1"},
     {"Cache-Control", "private, max-age=0"},
     {"Content-Type", "text/html; charset=ISO-8859-1"},
     {"P3P", "CP=\"This is not a P3P policy! See g.co/p3phelp for more info.\""},
     {"Server", "gws"},
     {"X-XSS-Protection", "1; mode=block"},
     {"X-Frame-Options", "SAMEORIGIN"},
     {"Set-Cookie",
      "1P_JAR=2019-01-11-08; expires=Sun, 10-Feb-2019 08:13:03 GMT; path=/; domain=.google.com"},
     {"Set-Cookie",
      "NID=154=eRdDgOkW7gEdW7vRAPVM1Q7p3GKbBPOSH3yr07CL414Lmx740Jtk9WTPtl9RbGzWJ4QCetWtoQIjSbv_F-ML6Bs6_I9tt91ED_TD8ZKQrenqMr9ykhB7oBd8XoN7W5TqWNTy5jdlEjPFjwkAL42qTrjgGR2MJ5_jTphwwzVCKS8; expires=Sat, 13-Jul-2019 08:13:03 GMT; path=/; domain=.google.com; HttpOnly"},
     {"Alt-Svc", "quic=\":443\"; ma=2592000; v=\"44,43,39,35\""},
     {"Accept-Ranges", "none"},
     {"Vary", "Accept-Encoding"},
     {"Transfer-Encoding", "chunked"}
   ],
   request: %HTTPoison.Request{
     body: "",
     headers: [],
     method: :get,
     options: [],
     params: %{},
     url: "https://www.google.com"
   },
   request_url: "https://www.google.com",
   status_code: 200
 }}

Finally, I request the Yahoo home page:

iex(3)> HTTPoison.get "https://www.yahoo.com"
{:ok,
 %HTTPoison.Response{
   body: "<!DOCTYPE html>\n<html lang=en>\n  <meta charset=utf-8>\n  <meta name=viewport content=\"initial-scale=1, minimum-scale=1, width=device-width\">\n  <title>Error 404 (Not Found)!!1</title>\n  <style>\n    *{margin:0;padding:0}html,code{font:15px/22px arial,sans-serif}html{background:#fff;color:#222;padding:15px}body{margin:7% auto 0;max-width:390px;min-height:180px;padding:30px 0 15px}* > body{background:url(//www.google.com/images/errors/robot.png) 100% 5px no-repeat;padding-right:205px}p{margin:11px 0 22px;overflow:hidden}ins{color:#777;text-decoration:none}a img{border:0}@media screen and (max-width:772px){body{background:none;margin-top:0;max-width:none;padding-right:0}}#logo{background:url(//www.google.com/images/branding/googlelogo/1x/googlelogo_color_150x54dp.png) no-repeat;margin-left:-5px}@media only screen and (min-resolution:192dpi){#logo{background:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) no-repeat 0% 0%/100% 100%;-moz-border-image:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) 0}}@media only screen and (-webkit-min-device-pixel-ratio:2){#logo{background:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) no-repeat;-webkit-background-size:100% 100%}}#logo{display:inline-block;height:54px;width:150px}\n  </style>\n  <a href=//www.google.com/><span id=logo aria-label=Google></span></a>\n  <p><b>404.</b> <ins>That’s an error.</ins>\n  <p>The requested URL <code>/</code> was not found on this server.  <ins>That’s all we know.</ins>\n",
   headers: [
     {"Content-Type", "text/html; charset=UTF-8"},
     {"Referrer-Policy", "no-referrer"},
     {"Content-Length", "1561"},
     {"Date", "Fri, 11 Jan 2019 08:13:27 GMT"},
     {"Alt-Svc", "quic=\":443\"; ma=2592000; v=\"44,43,39,35\""}
   ],
   request: %HTTPoison.Request{
     body: "",
     headers: [],
     method: :get,
     options: [],
     params: %{},
     url: "https://www.yahoo.com"
   },
   request_url: "https://www.yahoo.com",
   status_code: 404
 }}
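
To make this easier to reproduce, the same two requests could be wrapped in a small module. This is only a minimal sketch: the `Repro` module and its `summarize/2` and `run/0` functions are hypothetical names introduced here for illustration, while `HTTPoison.start` and `HTTPoison.get` are the same calls used in the session above. It assumes the `{:httpoison, "~> 1.5"}` dependency is available in the project.

```elixir
defmodule Repro do
  # Hypothetical helper (not part of HTTPoison): replays the two requests
  # from the iex session and prints the status code for each URL.
  @urls ["https://www.google.com", "https://www.yahoo.com"]

  # Successful HTTP exchange: report the status code the server sent back.
  def summarize(url, {:ok, %{status_code: code}}) do
    "#{url} -> #{code}"
  end

  # Transport-level failure: report the error reason instead.
  def summarize(url, {:error, %{reason: reason}}) do
    "#{url} -> error: #{inspect(reason)}"
  end

  # Issues the requests in order, one after the other, as in the session.
  def run do
    {:ok, _} = HTTPoison.start()

    for url <- @urls do
      url |> summarize(HTTPoison.get(url)) |> IO.puts()
    end

    :ok
  end
end
```

Calling `Repro.run()` from `iex -S mix` prints one line per URL, so a 404 on the second line would show the problem without the full response bodies.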

0 answers:

No answers yet