Using pandas read_html()

Time: 2020-06-10 07:26:14

Tags: python pandas web-scraping

I am working on a project that requires scraping data from my university's website, https://erp.aktu.ac.in/WebPages/OneView/OneView.aspx. When I enter a roll number (1513310***, from 001 to 100), the result is displayed, but when I copy the resulting URL and paste it back into the browser, it redirects me to the page where the number has to be entered again. I assume the same thing happens when pd.read_html() tries to fetch the data. Is there any way to get around this?
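pd.read_html() downloads the page itself with a plain GET request that carries none of the browser session's cookies, so the session-bound enc= URL most likely serves the entry page (with no result table) back to it. A minimal sketch to check that theory, assuming the requests library is available:

import requests

# The session-bound URL copied from the browser; read_html fetches it the
# same way: a fresh GET with no cookies from the earlier browser session.
url = ('https://erp.aktu.ac.in/WebPages/OneView/OVEngine.aspx'
       '?enc=NnCOpTxI4+e2v6OtxoLaIVhtGRRyQHWhl51tE9IxJAlzwgkcwHudd8EEQQF6+chV')

resp = requests.get(url)
print(resp.status_code)
# If the server has bounced the request back to the entry form, the body
# contains no result <table>, which is exactly what makes read_html raise
# "No tables found".
print('<table' in resp.text.lower())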

>>> import pandas as pd
>>> pd.read_html('https://erp.aktu.ac.in/WebPages/OneView/OVEngine.aspx?enc=NnCOpTxI4+e2v6OtxoLaIVhtGRRyQHWhl51tE9IxJAlzwgkcwHudd8EEQQF6+chV')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python36\lib\site-packages\pandas\io\html.py", line 1100, in read_html
    displayed_only=displayed_only,
  File "C:\Python36\lib\site-packages\pandas\io\html.py", line 915, in _parse
    raise retained
  File "C:\Python36\lib\site-packages\pandas\io\html.py", line 895, in _parse
    tables = p.parse_tables()
  File "C:\Python36\lib\site-packages\pandas\io\html.py", line 213, in parse_tables
    tables = self._parse_tables(self._build_doc(), self.match, self.attrs)
  File "C:\Python36\lib\site-packages\pandas\io\html.py", line 545, in _parse_tables
    raise ValueError("No tables found")
ValueError: No tables found

The error is raised because the result page cannot be retrieved. Is there any workaround for this?
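One approach that is sometimes used for ASP.NET pages like this one is to reproduce the form submission inside a requests.Session and then hand the returned HTML string to pd.read_html() instead of a URL. The sketch below uses hypothetical names for the roll-number input and the submit button; the real names, together with the ASP.NET hidden fields, would have to be taken from the page source or the browser's network tab:

import pandas as pd
import requests
from bs4 import BeautifulSoup

FORM_URL = 'https://erp.aktu.ac.in/WebPages/OneView/OneView.aspx'

with requests.Session() as session:
    # Load the entry page first so the session cookie and the ASP.NET
    # hidden fields (__VIEWSTATE, __EVENTVALIDATION, ...) are available.
    page = session.get(FORM_URL)
    soup = BeautifulSoup(page.text, 'html.parser')

    def hidden(name):
        tag = soup.find('input', {'name': name})
        return tag['value'] if tag else ''

    payload = {
        '__VIEWSTATE': hidden('__VIEWSTATE'),
        '__VIEWSTATEGENERATOR': hidden('__VIEWSTATEGENERATOR'),
        '__EVENTVALIDATION': hidden('__EVENTVALIDATION'),
        # Hypothetical field names: the actual input and button names must
        # be copied from the form's HTML; the roll number is a placeholder.
        'txtRollNo': '1513310001',
        'btnSearch': 'Search',
    }

    # Submit the form within the same session, then parse any tables from
    # the returned HTML string instead of letting read_html fetch the URL.
    result = session.post(FORM_URL, data=payload)
    # This still raises "No tables found" if the post did not reach a
    # result page, which is a sign the payload's field names are wrong.
    tables = pd.read_html(result.text)
    print(len(tables))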

0 Answers:

No answers yet.