Python Requests returns the head but no body

Time: 2017-11-30 19:30:04

Tags: python-requests

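I put a few pieces of code together so that I could analyze the HTML of a page I'm interested in. However, I am not getting the same HTML that I get when I visit the page normally in my browser.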

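What I mean is: in the browser, the <body> tag contains all the expected tags and content, such as <h1> and so on. With Requests, result.text gives me a body that holds nothing but a decodeURIComponent call. I ran that call in the browser's console, but it did not produce the rest of the page, only a JSON-like result.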

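OBS¹: I have tried setting the encoding explicitly, sending Referer and User-Agent headers, and running the request inside a session. None of that worked.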

OBS²: By the way, this is the website: http://www.danielfischer.com/

This is the page I receive. Note that the <body> tag contains no visible content, although it should hold a lot of blockquotes:

<!DOCTYPE html>\n
<html>
   \n
   <head>
      \n  
      <link rel="stylesheet" type="text/css" class="__meteor-css__" href="/3645e6749a7bb15e2b7a2b598d31f70b37ebf857.css?meteor_css_resource=true">
      \n
      <title>Daniel Fischer / Leader, Developer, Designer - San Francisco & Los Angeles</title>
      \n\n
   </head>
   \n
   <body>
      \n\n\n\n<script type="text/javascript">__meteor_runtime_config__ = JSON.parse(decodeURIComponent("%7B%22meteorRelease%22%3A%22METEOR%401.3%22%2C%22meteorEnv%22%3A%7B%22NODE_ENV%22%3A%22production%22%2C%22TEST_METADATA%22%3A%22%7B%7D%22%7D%2C%22PUBLIC_SETTINGS%22%3A%7B%7D%2C%22ROOT_URL%22%3A%22http%3A%2F%2Fwww.danielfischer.com%22%2C%22ROOT_URL_PATH_PREFIX%22%3A%22%22%2C%22appId%22%3A%22l92idpwahzxm4o1rd1%22%2C%22autoupdateVersion%22%3A%22bc9eac9e135921bb6593bdd334fe6855bedbb06e%22%2C%22autoupdateVersionRefreshable%22%3A%2237e5fc255eafc269ecb1fa482090c66aa8d627cc%22%2C%22autoupdateVersionCordova%22%3A%22none%22%7D"));</script>\n\n  <script type="text/javascript" src="/5dc2d6c014e5a058f745b57cca61e0d242cf06b7.js?meteor_js_resource=true"></script>\n\n\n
   </body>
   \n
</html>
\n'
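
For reference, the percent-encoded string inside the first <script> tag can be decoded offline to see what it actually holds. The sketch below is a Python stand-in for the JavaScript JSON.parse(decodeURIComponent(...)) call, using only the standard library; the helper name decode_runtime_config is hypothetical. The decoded value is only runtime configuration (release, root URL, app id), not page content. That suggests, as an assumption not confirmed in this thread, that the visible markup is built in the browser by the JavaScript bundle referenced in the second <script> tag, which Requests does not execute.

```python
import json
from urllib.parse import unquote

# Percent-encoded payload copied verbatim from the <script> tag in the
# response above, split across lines only for readability.
ENCODED_CONFIG = (
    "%7B%22meteorRelease%22%3A%22METEOR%401.3%22%2C"
    "%22meteorEnv%22%3A%7B%22NODE_ENV%22%3A%22production%22%2C"
    "%22TEST_METADATA%22%3A%22%7B%7D%22%7D%2C"
    "%22PUBLIC_SETTINGS%22%3A%7B%7D%2C"
    "%22ROOT_URL%22%3A%22http%3A%2F%2Fwww.danielfischer.com%22%2C"
    "%22ROOT_URL_PATH_PREFIX%22%3A%22%22%2C"
    "%22appId%22%3A%22l92idpwahzxm4o1rd1%22%2C"
    "%22autoupdateVersion%22%3A%22bc9eac9e135921bb6593bdd334fe6855bedbb06e%22%2C"
    "%22autoupdateVersionRefreshable%22%3A%2237e5fc255eafc269ecb1fa482090c66aa8d627cc%22%2C"
    "%22autoupdateVersionCordova%22%3A%22none%22%7D"
)


def decode_runtime_config(encoded: str) -> dict:
    """Hypothetical helper: undo the percent-encoding, then parse the JSON."""
    return json.loads(unquote(encoded))


config = decode_runtime_config(ENCODED_CONFIG)
print(config["meteorRelease"])  # METEOR@1.3
print(config["ROOT_URL"])       # http://www.danielfischer.com
print(sorted(config))           # configuration keys only, no page markup
```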

This is my Python code. Please ignore the commented-out lines; I only left them in to show my other attempts at a fix:

import requests, bs4

url = 'http://www.danielfischer.com/'

with requests.Session() as s:
    page = s.get(url, headers={"Referer": "https://www.facebook.com/", 'User-Agent':'test'})
    print(page.text.encode('utf-8'))

#page.encoding = 'utf-8'
#pageParsed = bs4.BeautifulSoup(pageRaw, "html.parser")
#outfile = open(path, 'w')
#outfile.write(str(page.text))

#print(pageRaw.text)
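
One way to check what the response really contains is to count the tags inside <body>. The sketch below uses the standard library's html.parser rather than bs4 so that it runs without extra packages. BodyTagCounter and body_tag_counts are hypothetical names, and SAMPLE_HTML is a shortened stand-in for the response shown above (the inline config is omitted), not the live page.

```python
from collections import Counter
from html.parser import HTMLParser


class BodyTagCounter(HTMLParser):
    """Hypothetical helper: count the start tags that appear inside <body>."""

    def __init__(self):
        super().__init__()
        self.in_body = False
        self.tags = Counter()

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self.in_body = True
        elif self.in_body:
            self.tags[tag] += 1

    def handle_endtag(self, tag):
        if tag == "body":
            self.in_body = False


# Shortened stand-in for the response; the inline config script is omitted.
SAMPLE_HTML = """<!DOCTYPE html>
<html>
<head>
<title>Daniel Fischer / Leader, Developer, Designer - San Francisco & Los Angeles</title>
</head>
<body>
<script type="text/javascript">/* runtime config omitted */</script>
<script type="text/javascript" src="/5dc2d6c014e5a058f745b57cca61e0d242cf06b7.js?meteor_js_resource=true"></script>
</body>
</html>"""


def body_tag_counts(html: str) -> Counter:
    """Return a Counter of tag names found inside <body>."""
    parser = BodyTagCounter()
    parser.feed(html)
    parser.close()
    return parser.tags


counts = body_tag_counts(SAMPLE_HTML)
# True when the body holds nothing except <script> elements.
only_scripts = set(counts) == {"script"}
print(counts, only_scripts)
```

With the live response, body_tag_counts(page.text) could be called instead of using the sample string. If only script tags show up, the server is sending an empty shell, and changing headers or the encoding on the Requests side would not be expected to make any difference.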

Any insight would be appreciated :)

0 answers:

No answers