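I am trying to visualize volcano data. The data is downloaded and parsed with lxml. The Volcano World source page lists the volcano data in several HTML tables; each table is read into its own pandas DataFrame and appended to a list of DataFrames.
I keep getting this error:
OSError: Error reading file 'http://volcano.oregonstate.edu/oldroot/volcanoes/alpha.html': failed to load external entity "http://volcano.oregonstate.edu/oldroot/volcanoes/alpha.html"
Can you help me get this code working?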
import json
from lxml import html
from mpl_toolkits.basemap import Basemap
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
url = 'http://volcano.oregonstate.edu/oldroot/volcanoes/alpha.html'
xpath = '//table'
tree = html.parse(url)
tables = tree.xpath(xpath)
table_dfs = []
for idx in range(4, len(tables)):
    df = pd.read_html(html.tostring(tables[idx]), header=0)[0]
    table_dfs.append(df)
A variant that passes the URL to etree.fromstring fails with this traceback instead:
Traceback (most recent call last):
  File "C:\Apps\Anaconda\envs\my_maps_env\lib\site-packages\IPython\core\interactiveshell.py", line 3326, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-10-d2be1cde3918>", line 3, in <module>
    tree = etree.fromstring(url)
  File "src/lxml/etree.pyx", line 3234, in lxml.etree.fromstring
  File "src/lxml/parser.pxi", line 1876, in lxml.etree._parseMemoryDocument
  File "src/lxml/parser.pxi", line 1757, in lxml.etree._parseDoc
  File "src/lxml/parser.pxi", line 1068, in lxml.etree._BaseParser._parseUnicodeDoc
  File "src/lxml/parser.pxi", line 601, in lxml.etree._ParserContext._handleParseResultDoc
  File "src/lxml/parser.pxi", line 711, in lxml.etree._handleParseResult
  File "src/lxml/parser.pxi", line 640, in lxml.etree._raiseParseError
  File "<string>", line 1
XMLSyntaxError: Start tag expected, '<' not found, line 1, column 1
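For reference, here is a minimal sketch of one way the download and the parsing could be separated. The XMLSyntaxError arises because etree.fromstring expects markup text, so handing it a URL string fails at the first character. The OSError may mean that lxml's built-in loader cannot retrieve the page, for example if the server now redirects to HTTPS, which libxml2 does not fetch on its own; that cause is an assumption, not something the error message confirms. The helper names fetch_page and tables_to_dataframes, the first_table parameter, and the User-Agent header are hypothetical choices made for illustration.

```python
import io
import urllib.request

import pandas as pd
from lxml import html

URL = "http://volcano.oregonstate.edu/oldroot/volcanoes/alpha.html"


def fetch_page(url, timeout=30):
    """Download the raw page bytes with urllib, which follows redirects."""
    # The User-Agent header is an assumption; some servers reject the default one.
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


def tables_to_dataframes(page, first_table=4):
    """Parse HTML markup (not a URL) and convert each <table> to a DataFrame.

    Tables before index first_table are skipped, mirroring range(4, ...) in
    the original loop.
    """
    tree = html.fromstring(page)
    tables = tree.xpath("//table")
    frames = []
    for table in tables[first_table:]:
        # Serialize the element to text and wrap it in a file-like object,
        # which pd.read_html accepts.
        markup = html.tostring(table, encoding="unicode")
        frames.append(pd.read_html(io.StringIO(markup), header=0)[0])
    return frames


# Example usage (needs network access):
# table_dfs = tables_to_dataframes(fetch_page(URL))
```

Keeping the download in its own function makes it easier to see which step fails: if fetch_page raises, the problem is with the network or the URL; if tables_to_dataframes raises, the problem is with the markup or the table index.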