Web scraping a printer's EWS: output is incorrect

Time: 2017-04-04 05:28:01

Tags: python python-3.x web-scraping beautifulsoup embeddedwebserver

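I'm trying to scrape my printer's embedded web server (EWS) to get the current print count and write it to a file. I'm new to this, so I tried printing the whole HTML to check whether my script is set up correctly so far, but the output is gibberish. Here is my code: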

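from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup

# The part after '#' is a URL fragment; it is not sent to the server.
myAddress = "http://10.0.0.199/#hId-UsageReportPage"

# Fetch the raw response body as bytes, then close the connection.
uClient = uReq(myAddress)
pageHTML = uClient.read()
uClient.close()

# Parse the bytes with the lxml parser.
pageSoup = soup(pageHTML, "lxml")

print(pageSoup.prettify())

# Keep the console window open until Enter is pressed.
input()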

Here is my output:

<html>
 <body>
  <p>
   ‹     TQoÚ0~ĸú©}p¼ª/SI6hÕMíŠVªn&amp;9ˆiˆ=û( iÿ}v%€ºõÉÎù»ï¾Üùsr2¸ï~¯  y        ÃÇÏ·_úÀ¸O}!£ü¸ÝÝÂyüȪŒ„¸úÆ"`‘¹b¹\ÆË‹XÛ©}«@sò6[îê¤8§œõ¢$Ä‚2ïE ÉIB`âøk¡^RÖ×aE|´6È k¾RF¸"’»Ò:¤ôqtÍ?25ˬ2ä³6à™|‘M”ÄÄÊ9&gt;iû|?žA
y/j•w6K™¨Y—žu·‹gŽ½%nÇÔ  T¥ªž¡ðÅÓ£Âe½Ä™ódË”9Z—è
DÚc¯¢×9–v@Æ€ö]â:Q"š©EÉXçk?DUÿ¨
   <ef:>
Ç®Æ!W¦õ³ÂÍò^²Ð\½¼+솢
.e5]ø¡s8¿ô:‡ ±¬*´o“6ÜÈ
ß(*÷*¼ÊÈJé\
×\ªºÎ‘HœÐA…?H´Ûk`›#kl3Ú±­ªp£·›yV¢´G HN8‡xO:p~Üâ‰ÖôîËqrûíùŸ—…h|ã…óä‘šœú)ÀI
ËÉ™¯á?ãRg’”®b‹þ:dxwÑ`°³nÔrqRéí~Oc¥ùÌñ #¼_¦÷Õkyh*çmèŸ-‹¹¯        2ËÐ9 oðFsŠ0N„
ܦ7ôtXÉq‰Ð"Ñþ@ÂØê¥ó}¤Bz».ŒÑ–\Ùpmº”ˆ–\±1¤h^׿0…ÔáÔ
   </ef:>
  </p>
 </body>
</html>

I'm not sure why this is happening. I'm using Python 3.6.0 and Beautiful Soup 4. Also, my printer is an HP Photosmart D110a, in case that helps.
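
One thing that may be worth checking (an assumption, not confirmed anywhere in this thread): the garbled text in the output above starts with the character ‹, which is how byte 0x8B is displayed in Windows-1252, and 0x1F 0x8B is the gzip magic number. That would be consistent with the printer returning a gzip-compressed body, which `urlopen` does not decompress on its own. The sketch below uses a hypothetical helper, `maybe_gunzip`, that decompresses the bytes only when they start with the gzip magic number.

```python
import gzip

# First two bytes of every gzip stream (RFC 1952).
GZIP_MAGIC = b"\x1f\x8b"


def maybe_gunzip(raw: bytes) -> bytes:
    """Decompress raw if it looks like gzip data; otherwise return it unchanged.

    Hypothetical helper for illustration, not part of urllib or Beautiful Soup.
    """
    if raw[:2] == GZIP_MAGIC:
        return gzip.decompress(raw)
    return raw


if __name__ == "__main__":
    # Simulate a compressed response body and a plain one.
    compressed = gzip.compress(b"<html><body>ok</body></html>")
    print(maybe_gunzip(compressed))
    print(maybe_gunzip(b"<html></html>"))
```

In the script above, this would be applied before parsing, for example `pageSoup = soup(maybe_gunzip(pageHTML), "lxml")`. If the bytes are not gzip data, the helper changes nothing, so it is safe to leave in while investigating.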

(Update) Here is the HTML:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
 "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
  <script type="text/javascript">
  frameWorkObj = {};
  frameWorkObj.pageMgrDataPathPrefix = "/webApps/Layout/";
  </script>

  <script src="/framework/framework.js" type="text/javascript"></script>

  <link href="/webApps/Layout/layout.css" rel="stylesheet" type="text/css" />

  <script src="/webApps/Layout/header.js" type="text/javascript"></script>
</head>

<body>
<iframe id="pgm-history-iframe" src="/framework/HistoryFrame.html" style="display: none;"></iframe>
<iframe src="/framework/cookie/cookie.html" style="display: none;"></iframe>

  <div id="pgm-language-div"></div>
  <div id="pgm-banner"></div>
  <div id="pgm-top-pane"></div>
  <div id="pgm-title-div"></div>

  <div class="pgm-container">
  <div id="pgm-left-pane"></div>

  <div class="outerContentPane">
  <div id="contentPane" class="contentPane"></div>
  </div>
  <div class="clear"></div>
  </div> <!-- .pgm-container -->

  <div id="pgm-footer"></div>
  <div id="pgm-page-ts-div"></div>

<script type="text/javascript">
// frame buster
if(top != self)
  top.location.replace(self.location.href);
</script>

<noscript>
<div id="pgm-no-js-text">
<p>JavaScript is required to access this website.</p>

<p>Please enable JavaScript or use a browser that supports JavaScript.</p>
</div>
</noscript>
</body>
</html>
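
Reading the HTML above, the placeholder elements such as `contentPane` are empty and the `<noscript>` block says JavaScript is required. This suggests (as an interpretation of the markup, not something confirmed here) that the usage data is filled in by script after the page loads, so it would not be present in the static HTML that `urlopen` returns. The sketch below runs on an abbreviated copy of that markup and uses only the standard library; `PageOutline` is a hypothetical class name. It lists the external script URLs and the `id`s of `div` elements that have no content.

```python
from html.parser import HTMLParser

# Abbreviated copy of the markup posted in the question.
PAGE = """
<html><head>
<script src="/framework/framework.js" type="text/javascript"></script>
<script src="/webApps/Layout/header.js" type="text/javascript"></script>
</head><body>
<div id="pgm-banner"></div>
<div class="outerContentPane">
<div id="contentPane" class="contentPane"></div>
</div>
<noscript><div id="pgm-no-js-text">
<p>JavaScript is required to access this website.</p>
</div></noscript>
</body></html>
"""


class PageOutline(HTMLParser):
    """Collect external script URLs and ids of divs with no text or children."""

    def __init__(self):
        super().__init__()
        self.script_srcs = []
        self.empty_div_ids = []
        self._open_divs = []  # stack of [id, has_content]

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # Any nested element counts as content for every enclosing div.
        for entry in self._open_divs:
            entry[1] = True
        if tag == "script" and attrs.get("src"):
            self.script_srcs.append(attrs["src"])
        elif tag == "div":
            self._open_divs.append([attrs.get("id"), False])

    def handle_data(self, data):
        # Non-whitespace text counts as content for every enclosing div.
        if data.strip():
            for entry in self._open_divs:
                entry[1] = True

    def handle_endtag(self, tag):
        if tag == "div" and self._open_divs:
            div_id, has_content = self._open_divs.pop()
            if div_id and not has_content:
                self.empty_div_ids.append(div_id)


outline = PageOutline()
outline.feed(PAGE)
print(outline.script_srcs)
print(outline.empty_div_ids)
```

If the placeholders are indeed populated by those scripts, the print count would have to come from whatever URL the scripts request, or from a tool that executes JavaScript; which of these applies to this printer is not established in this thread.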

0 answers:

No answers yet