What user_agent string should I use with Python urllib?

Time: 2015-11-05 20:26:23

Tags: python html urllib

I am trying to read HTML from a website using urllib with Python 3.4 and am running into a problem.

I am trying to download a page with the conjugation of the Italian verb "essere". I have two sources available: wordreference.com and verbix.com.

With this code, I can successfully get the HTML from wordreference.com:

import urllib.parse
import urllib.request

url = 'http://www.wordreference.com//conj//ItVerbs.aspx?v=essere'
user_agent = 'Mozilla/5.0 (Windows NT 6.1; Win64; x64)'
values = {'name': 'John',
          'location': 'USA',
          'language': 'Python'}
headers = {'User-Agent': user_agent}

# URL-encode the form values and convert them to bytes for the request body
data = urllib.parse.urlencode(values)
data = data.encode('utf-8')
req = urllib.request.Request(url, data, headers)
with urllib.request.urlopen(req) as response:
    verbHTMLStr = response.read()
    print(verbHTMLStr)

If I change the URL to point at the Verbix.com site instead:

url = 'http://www.verbix.com//webverbix//Italian//essere.html'

the HTML returned is that of www.verbix.com/languages

Both URL strings return the expected page when pasted into a browser's address bar.

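It seems to me that the Verbix site wants to see something else as the user_agent, but I can't figure out what it wants. I have tried many different user_agent strings, and all of them return the same wrong page.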
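
One way to confirm that the server is redirecting the request, rather than serving different content at the same address, is to compare the requested URL with the URL urlopen finally ended up at. The sketch below is only an illustration: describe_response and check_url are hypothetical helper names, and the plain string comparison assumes the server does not rewrite the URL in harmless ways.

```python
import urllib.request


def describe_response(requested_url, response):
    """Summarize where a urlopen() response actually came from."""
    final_url = response.geturl()  # URL after any redirects were followed
    return {
        'requested': requested_url,
        'final': final_url,
        'status': response.getcode(),
        'redirected': final_url != requested_url,
    }


def check_url(url, user_agent):
    """Fetch url with the given User-Agent and report on redirects."""
    req = urllib.request.Request(url, headers={'User-Agent': user_agent})
    with urllib.request.urlopen(req) as response:
        return describe_response(url, response)
```

Calling check_url with the Verbix URL and each candidate user_agent string would show whether the final URL changes from one string to the next.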

1 Answer:

Answer 0: (score: 0)

For me, the following works!

# Python 2: urllib.urlopen() with only a URL sends a plain GET request
import urllib

res = urllib.urlopen('http://www.verbix.com//webverbix//Italian//essere.html').read()
print res
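
For Python 3, a rough equivalent of the snippet above might look like the sketch below. The names build_request and fetch_html are hypothetical helpers, not part of urllib. The sketch also shows one difference from the question's code that is easy to miss: passing data to urllib.request.Request turns the request into a POST, while omitting it gives a GET like the one urllib.urlopen sends here. Whether that difference is what made Verbix return the /languages page is an assumption; nothing in this thread confirms it.

```python
import urllib.parse
import urllib.request

VERBIX_URL = 'http://www.verbix.com//webverbix//Italian//essere.html'
USER_AGENT = 'Mozilla/5.0 (Windows NT 6.1; Win64; x64)'


def build_request(url, form_values=None):
    """Build a Request; form_values=None leaves the request body empty."""
    data = None
    if form_values is not None:
        data = urllib.parse.urlencode(form_values).encode('utf-8')
    return urllib.request.Request(url, data, {'User-Agent': USER_AGENT})


def fetch_html(url):
    """Download url with a plain GET and return the body as text."""
    with urllib.request.urlopen(build_request(url)) as response:
        charset = response.headers.get_content_charset() or 'utf-8'
        return response.read().decode(charset, errors='replace')


# No network access is needed to compare the two request styles.
get_req = build_request(VERBIX_URL)
post_req = build_request(VERBIX_URL, {'name': 'John'})
print(get_req.get_method())   # GET
print(post_req.get_method())  # POST
```

Calling fetch_html(VERBIX_URL) would perform the actual download; it is left uncalled here so the comparison runs without network access.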

The Python 2 code prints:

<!doctype html>
<html lang="en">
<!-- #BeginTemplate "/Templates/verbtable_pure.dwt" -->
<!-- DW6 -->
<head>
<title>
Italian
verb
essere
conjugated in all tenses.</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="keywords" content="Language,verb,Italian,essere,conjugation,conjugate">
<meta name="description" content="Italian verb essere conjugated in all tenses.">
<meta name="author" content="Verbix">
<meta name="google" value="notranslate">
<link rel="stylesheet" href="/system/pure/pure-min.css">
<!--[if lte IE 8]>
        <link rel="stylesheet" href="/combo/1.18.13?/css/layouts/side-menu-old-ie.css">
    <![endif]-->
<!--[if gt IE 8]><!-->
<link rel="stylesheet" href="/system/misc-pure/side-menu-verb.css">
<!--<![endif]-->
<!--[if lt IE 9]>
        <script src="http://cdnjs.cloudflare.com/ajax/libs/html5shiv/3.7/html5shiv.js"></script>
    <![endif]-->
<!--[if lte IE 8]>
    <link rel="stylesheet" href="/system/pure/grids-responsive-old-ie-min.css">
<![endif]-->
<!--[if gt IE 8]><!-->
<link rel="stylesheet" href="/system/pure/grids-responsive-min.css">
<!--<![endif]-->
<meta name="viewport" content="width=device-width, initial-scale=1">
<script type="text/javascript">

  var _gaq = _gaq || [];
  _gaq.push(['_setAccount', 'UA-61929-7']);
  _gaq.push(['_trackPageview']);

  (function() {
    var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
    ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
    var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
  })();

</script>
<!-- Begin Cookie Consent plugin by Silktide - http://silktide.com/cookieconsent -->
<script type="text/javascript">
    window.cookieconsent_options = {"message":"We use cookies to personalize content and ads to users, providing features for social media and analyze our traffic. We will forward information about your use of our website to social media and advertising and research companies that we work with.","dismiss":"Got it!","learnMore":"More info","link":"http://www.verbix.com/webverbix/termsofuse.html","theme":"dark-top"};
</script>
<script type="text/javascript" src="//s3.amazonaws.com/cc.silktide.com/cookieconsent.latest.min.js"></script>
<!-- End Cookie Consent plugin -->
<!-- #BeginEditable "Head" --><!-- #EndEditable -->

</head>


<body>


<div id="layout"> 
  <!-- Menu toggle --> 
  <a href="#menu" id="menuLink" class="menu-link"> 
  <!-- Hamburger icon --> 
  <span></span> </a> 
  <div id="menu"> <a href="/"><img src="/system/html5/top_left.png"/> </a> 
    <div class="pure-menu"> <a class="pure-menu-heading" href="/languages">Online</a> 
      <ul class="pure-menu-list"> 
        <li class="pure-menu-item"><a href="/languages" class="pure-menu-link">Verb Conjugator</a></li> 
        <li class="pure-menu-item"><a href="/translate/" class="pure-menu-link">Verb Translation</a></li> 
        <li class="pure-menu-item"><a href="/find-verb/" class="pure-menu-link">Find Verb</a></li> 
        <li class="pure-menu-item"><a href="/games/" class="pure-menu-link">Games</a></li> 
        <li class="pure-menu-item"><a href="/maps/" class="pure-menu-link">Language Maps</a></li> 
      </ul> 
      <a class="pure-menu-heading" href="/windowsverbix/">Windows</a> 
      <ul class="pure-menu-list"> 
        <li class="pure-menu-item"><a href="/windowsverbix/" class="pure-menu-link">Verbix for Windows</a></li> 
        <li class="pure-menu-item"><a href="/download/" class="pure-menu-link">Download</a></li> 
        <li class="pure-menu-item"><a href="/store/" class="pure-menu-link">Store</a></li> 
      </ul> 
      <a class="pure-menu-heading" href="/wizard/">For Webmasters</a> 
      <ul class="pure-menu-list"> 
        <li class="pure-menu-item"><a href="/wizard/" class="pure-menu-link">Your Own Conjugator</a></li> 
      </ul> 
      <a class="pure-menu-heading" href="/webverbix/termsofuse.html">About ...</a> </div> 
  </div> 
  <div id="main"> 
    <div class="header"> 
      <h1> 
        Italian 
        :
        essere 
      </h1> 
      <h2> 
        Italian 
        verb ' 
        essere 
        ' conjugated in all tenses</h2> 
    </div> 
    <div class="advertising"> 
      <script async src="//pagead2.googlesyndication.com/pagead/js/adsbygoogle.js"></script> 
      <!-- MainLeftReactive --> 
      <ins class="adsbygoogle"
     style="display:block"
     data-ad-client="ca-pub-3716807887832772"
     data-ad-slot="9886612560"
     data-ad-format="auto"></ins> 
      <script>
(adsbygoogle = window.adsbygoogle || []).push({});
</script> 
    </div> 
    <div class="verbcontent"> 
      <p><a href="http://www.verbix.com/languages/italian.shtml" rel="prev">Conjugate another
        Italian 
        verb</a> 
        <!-- AddThis Button BEGIN --> 
        <script type="text/javascript">var addthis_pub="verbix";</script> 
        <script type="text/javascript">var addthis_config = {services_exclude: 'print',data_ga_property: 'UA-61929-7', data_track_clickback: true}</script> 
        <a href="http://www.addthis.com/bookmark.php?v=20" onMouseOver="return addthis_open(this, '', '[URL]', '[TITLE]')" onMouseOut="addthis_close()" onClick="return addthis_sendto()"><img src="http://s7.addthis.com/static/btn/lg-share-en.gif" width="125" height="16" alt="Bookmark and Share" style="border:0"/></a> 
        <script type="text/javascript" src="http://s7.addthis.com/js/200/addthis_widget.js"></script> 
        <!-- AddThis Button END --> 
      </p> 

      <div class="pure-g verbtable"> <!-- #BeginEditable "Full_width_text" --> 
        <div class="pure-u-1-1"> 
          <h2>Nominal Forms</h2> 
          <div class="pure-g"> 
            <div class="pure-u-1-3"> 
              <p><b>Infinito:<br>Participio presente:<br>Gerundio:<br>Participio passato:</b></p> 
            </div> 
            <div class="pure-u-1-3"> 
              <p><span class="normal">essere</span><br>
<span class="normal">essente</span><br>
<span class="normal">essendo</span><br>
<span class="irregular">stato</span><br>
</p> 
            </div> 
            <div class="pure-u-1-3"> 
              <p> 
                <span class="normal">avere stato</span><br>


                <span class="normal">avendo stato</span><br>

                <span class="normal">avente stato</span><br>

              </p> 
            </div> 
          </div> 
        </div> 
        <div class="pure-u-1-1 pure-u-lg-1-2"> 
          <h2>Indicativo</h2> 
          <div class="pure-g"> 
            <div class="pure-u-1-2"> 
              <h3>Presente</h3> 
              <p> 
                <font color=#007F00 face=Courier><span class="normal">io</span>&nbsp;&nbsp;&nbsp;</font><span class="irregular">sono</span><br>
<font color=#007F00 face=Courier><span class="normal">tu</span>&nbsp;&nbsp;&nbsp;</font><span class="irregular">sei</span><br>
<font color=#007F00 face=Courier><span class="normal">lui</span>&nbsp;&nbsp;</font><span class="irregular">&egrave;</span><br>
<font color=#007F00 face=Courier><span class="normal">noi</span>&nbsp;&nbsp;</font><span class="irregular">siamo</span><br>
<font color=#007F00 face=Courier><span class="normal">voi</span>&nbsp;&nbsp;</font><span class="irregular">siete</span><br>
<font color=#007F00 face=Courier><span class="normal">loro</span>&nbsp;</font><span class="irregular">sono</span><br>

              </p> 
            </div> 
            <div class="pure-u-1-2"> 
              <h3>Passato prossimo</h3> 
              <p> 
                <font color=#007F00 face=Courier><span class="normal">io</span>&nbsp;&nbsp;&nbsp;</font><span class="normal">ho stato</span><br>
<font color=#007F00 face=Courier><span class="normal">tu</span>&nbsp;&nbsp;&nbsp;</font><span class="normal">hai stato</span><br>
<font color=#007F00 face=Courier><span class="normal">lui</span>&nbsp;&nbsp;</font><span class="normal">ha stato</span><br>
<font color=#007F00 face=Courier><span class="normal">noi</span>&nbsp;&nbsp;</font><span class="normal">abbiamo stato</span><br>
<font color=#007F00 face=Courier><span class="normal">voi</span>&nbsp;&nbsp;</font><span class="normal">avete stato</span><br>
<font color=#007F00 face=Courier><span class="normal">loro</span>&nbsp;</font><span class="normal">hanno stato</span><br>

              </p> 
            </div> 
            <div class="pure-u-1-2"> 
              <h3>Imperfetto</h3> 
              <p> 
                <font color=#007F00 face=Courier><span class="normal">io</span>&nbsp;&nbsp;&nbsp;</font><span class="irregular">ero</span><br>
<font color=#007F00 face=Courier><span class="normal">tu</span>&nbsp;&nbsp;&nbsp;</font><span class="irregular">eri</span><br>
<font color=#007F00 face=Courier><span class="normal">lui</span>&nbsp;&nbsp;</font><span class="irregular">era</span><br>
<font color=#007F00 face=Courier><span class="normal">noi</span>&nbsp;&nbsp;</font><span class="irregular">eravamo</span><br>
<font color=#007F00 face=Courier><span class="normal">voi</span>&nbsp;&nbsp;</font><span class="irregular">eravate</span><br>
<font color=#007F00 face=Courier><span class="normal">loro</span>&nbsp;</font><span class="irregular">erano</span><br>

              </p> 
            </div> 
            <div class="pure-u-1-2"> 
              <h3>Trapassato prossimo</h3> 
              <p> 
                <font color=#007F00 face=Courier><span class="normal">io</span>&nbsp;&nbsp;&nbsp;</font><span class="normal">avevo stato</span><br>
<font color=#007F00 face=Courier><span class="normal">tu</span>&nbsp;&nbsp;&nbsp;</font><span class="normal">avevi stato</span><br>
<font color=#007F00 face=Courier><span class="normal">lui</span>&nbsp;&nbsp;</font><span class="normal">aveva stato</span><br>
<font color=#007F00 face=Courier><span class="normal">noi</span>&nbsp;&nbsp;</font><span class="normal">avevamo stato</span><br>
<font color=#007F00 face=Courier><span class="normal">voi</span>&nbsp;&nbsp;</font><span class="normal">avevate stato</span><br>
<font color=#007F00 face=Courier><span class="normal">loro</span>&nbsp;</font><span class="normal">avevano stato</span><br>

              </p> 
            </div> 
            <div class="pure-u-1-2"> 
              <h3>Futuro</h3> 
              <p> 
                <font color=#007F00 face=Courier><span class="normal">io</span>&nbsp;&nbsp;&nbsp;</font><span class="irregular">sar&ograve;</span><br>
<font color=#007F00 face=Courier><span class="normal">tu</span>&nbsp;&nbsp;&nbsp;</font><span class="irregular">sarai</span><br>
<font color=#007F00 face=Courier><span class="normal">lui</span>&nbsp;&nbsp;</font><span class="irregular">sar&agrave;</span><br>
<font color=#007F00 face=Courier><span class="normal">noi</span>&nbsp;&nbsp;</font><span class="irregular">saremo</span><br>
<font color=#007F00 face=Courier><span class="normal">voi</span>&nbsp;&nbsp;</font><span class="irregular">sarete</span><br>
<font color=#007F00 face=Courier><span class="normal">loro</span>&nbsp;</font><span class="irregular">saranno</span><br>

              </p> 
            </div> 
            <div class="pure-u-1-2"> 
              <h3>Futuro anteriore</h3> 
              <p> 
                <font color=#007F00 face=Courier><span class="normal">io</span>&nbsp;&nbsp;&nbsp;</font><span class="normal">avr&ograve; stato</span><br>
<font color=#007F00 face=Courier><span class="normal">tu</span>&nbsp;&nbsp;&nbsp;</font><span class="normal">avrai stato</span><br>
<font color=#007F00 face=Courier><span class="normal">lui</span>&nbsp;&nbsp;</font><span class="normal">avr&agrave; stato</span><br>
<font color=#007F00 face=Courier><span class="normal">noi</span>&nbsp;&nbsp;</font><span class="normal">avremo stato</span><br>
<font color=#007F00 face=Courier><span class="normal">voi</span>&nbsp;&nbsp;</font><span class="normal">avrete stato</span><br>
<font color=#007F00 face=Courier><span class="normal">loro</span>&nbsp;</font><span class="normal">avranno stato</span><br>

              </p> 
            </div> 
            <div class="pure-u-1-2"> 
              <h3>Passato remoto</h3> 
              <p> 
                <font color=#007F00 face=Courier><span class="normal">io</span>&nbsp;&nbsp;&nbsp;</font><span class="irregular">fui</span><br>
<font color=#007F00 face=Courier><span class="normal">tu</span>&nbsp;&nbsp;&nbsp;</font><span class="irregular">fosti</span><br>
<font color=#007F00 face=Courier><span class="normal">lui</span>&nbsp;&nbsp;</font><span class="irregular">fu</span><br>
<font color=#007F00 face=Courier><span class="normal">noi</span>&nbsp;&nbsp;</font><span class="irregular">fummo</span><br>
<font color=#007F00 face=Courier><span class="normal">voi</span>&nbsp;&nbsp;</font><span class="irregular">foste</span><br>
<font color=#007F00 face=Courier><span class="normal">loro</span>&nbsp;</font><span class="irregular">furono</span><br>

              </p> 
            </div> 
            <div class="pure-u-1-2"> 
              <h3>Trapassato remoto</h3> 
              <p> 
                <font color=#007F00 face=Courier><span class="normal">io</span>&nbsp;&nbsp;&nbsp;</font><span class="normal">ebbi stato</span><br>
<font color=#007F00 face=Courier><span class="normal">tu</span>&nbsp;&nbsp;&nbsp;</font><span class="normal">avesti stato</span><br>
<font color=#007F00 face=Courier><span class="normal">lui</span>&nbsp;&nbsp;</font><span class="normal">ebbe stato</span><br>
<font color=#007F00 face=Courier><span class="normal">noi</span>&nbsp;&nbsp;</font><span class="normal">avemmo stato</span><br>
<font color=#007F00 face=Courier><span class="normal">voi</span>&nbsp;&nbsp;</font><span class="normal">aveste stato</span><br>
<font color=#007F00 face=Courier><span class="normal">loro</span>&nbsp;</font><span class="normal">ebbero stato</span><br>

              </p> 
            </div> 
          </div> 
        </div> 
        <div class="pure-u-1-1 pure-u-lg-1-2"> 
          <h2>Congiuntivo</h2> 
          <div class="pure-g"> 
            <div class="pure-u-1-2"> 
              <h3>Presente</h3> ...........................