I'm trying to run the following code, to no avail. As far as I can tell, there are no syntax errors.
import quandl
import pandas as pd
fifty_states = pd.read_html('https://simple.wikipedia.org/wiki/List_of_U.S._states')
print(fifty_states)
Running this code produces the following error:
Traceback (most recent call last):
  File "C:/Users/Dave/Documents/Python Files/helloworld.py", line 15, in <module>
    fiddy_states = pd.read_html('http://simple.wikipedia.org/wiki/List_of_U.S._states')
  File "C:\Python35\lib\site-packages\pandas\io\html.py", line 874, in read_html
    parse_dates, tupleize_cols, thousands, attrs, encoding)
  File "C:\Python35\lib\site-packages\pandas\io\html.py", line 726, in _parse
    parser = _parser_dispatch(flav)
  File "C:\Python35\lib\site-packages\pandas\io\html.py", line 685, in _parser_dispatch
    raise ImportError("lxml not found, please install it")
ImportError: lxml not found, please install it
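I'm not sure why this is happening, since I (supposedly) have all the packages needed to run this code. I'm having trouble installing lxml and python3-lxml because the packages fail to install. As a fallback, I have installed the following: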
python-dev libxml2-dev libxslt1-dev zlib1g-dev
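as well as html5lib, which I have read is a suitable substitute for lxml.
At this point I'm not sure what else to do, because the fixes I found by searching (namely, installing lxml) don't work for me: I can't install lxml through pip on the command line in any form.
Any help is greatly appreciated.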
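Edit: It seems lxml was never installed on my computer. That is odd, because I am unable to install it with pip install lxml. Here is the error log I get when I try to install it: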
Collecting lxml
Using cached lxml-3.6.4.tar.gz
Building wheels for collected packages: lxml
Running setup.py bdist_wheel for lxml ... error
Complete output from command c:\python35\python.exe -u -c "import setuptools,
tokenize;__file__='C:\\Users\\Dwang\\AppData\\Local\\Temp\\pip-build-738bf61u\\l
xml\\setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().rep
lace('\r\n', '\n'), __file__, 'exec'))" bdist_wheel -d C:\Users\Dwang\AppData\Lo
cal\Temp\tmpm9z4yol6pip-wheel- --python-tag cp35:
Building lxml version 3.6.4.
Building without Cython.
ERROR: b"'xslt-config' is not recognized as an internal or external command,\r
\noperable program or batch file.\r\n"
** make sure the development packages of libxml2 and libxslt are installed **
Using build configuration of libxslt
running bdist_wheel
running build
running build_py
creating build
creating build\lib.win-amd64-3.5
creating build\lib.win-amd64-3.5\lxml
copying src\lxml\builder.py -> build\lib.win-amd64-3.5\lxml
copying src\lxml\cssselect.py -> build\lib.win-amd64-3.5\lxml
copying src\lxml\doctestcompare.py -> build\lib.win-amd64-3.5\lxml
copying src\lxml\ElementInclude.py -> build\lib.win-amd64-3.5\lxml
copying src\lxml\pyclasslookup.py -> build\lib.win-amd64-3.5\lxml
copying src\lxml\sax.py -> build\lib.win-amd64-3.5\lxml
copying src\lxml\usedoctest.py -> build\lib.win-amd64-3.5\lxml
copying src\lxml\_elementpath.py -> build\lib.win-amd64-3.5\lxml
copying src\lxml\__init__.py -> build\lib.win-amd64-3.5\lxml
creating build\lib.win-amd64-3.5\lxml\includes
copying src\lxml\includes\__init__.py -> build\lib.win-amd64-3.5\lxml\includes
creating build\lib.win-amd64-3.5\lxml\html
copying src\lxml\html\builder.py -> build\lib.win-amd64-3.5\lxml\html
copying src\lxml\html\clean.py -> build\lib.win-amd64-3.5\lxml\html
copying src\lxml\html\defs.py -> build\lib.win-amd64-3.5\lxml\html
copying src\lxml\html\diff.py -> build\lib.win-amd64-3.5\lxml\html
copying src\lxml\html\ElementSoup.py -> build\lib.win-amd64-3.5\lxml\html
copying src\lxml\html\formfill.py -> build\lib.win-amd64-3.5\lxml\html
copying src\lxml\html\html5parser.py -> build\lib.win-amd64-3.5\lxml\html
copying src\lxml\html\soupparser.py -> build\lib.win-amd64-3.5\lxml\html
copying src\lxml\html\usedoctest.py -> build\lib.win-amd64-3.5\lxml\html
copying src\lxml\html\_diffcommand.py -> build\lib.win-amd64-3.5\lxml\html
copying src\lxml\html\_html5builder.py -> build\lib.win-amd64-3.5\lxml\html
copying src\lxml\html\_setmixin.py -> build\lib.win-amd64-3.5\lxml\html
copying src\lxml\html\__init__.py -> build\lib.win-amd64-3.5\lxml\html
creating build\lib.win-amd64-3.5\lxml\isoschematron
copying src\lxml\isoschematron\__init__.py -> build\lib.win-amd64-3.5\lxml\iso
schematron
copying src\lxml\lxml.etree.h -> build\lib.win-amd64-3.5\lxml
copying src\lxml\lxml.etree_api.h -> build\lib.win-amd64-3.5\lxml
copying src\lxml\includes\c14n.pxd -> build\lib.win-amd64-3.5\lxml\includes
copying src\lxml\includes\config.pxd -> build\lib.win-amd64-3.5\lxml\includes
copying src\lxml\includes\dtdvalid.pxd -> build\lib.win-amd64-3.5\lxml\include
s
copying src\lxml\includes\etreepublic.pxd -> build\lib.win-amd64-3.5\lxml\incl
udes
copying src\lxml\includes\htmlparser.pxd -> build\lib.win-amd64-3.5\lxml\inclu
des
copying src\lxml\includes\relaxng.pxd -> build\lib.win-amd64-3.5\lxml\includes
copying src\lxml\includes\schematron.pxd -> build\lib.win-amd64-3.5\lxml\inclu
des
copying src\lxml\includes\tree.pxd -> build\lib.win-amd64-3.5\lxml\includes
copying src\lxml\includes\uri.pxd -> build\lib.win-amd64-3.5\lxml\includes
copying src\lxml\includes\xinclude.pxd -> build\lib.win-amd64-3.5\lxml\include
s
copying src\lxml\includes\xmlerror.pxd -> build\lib.win-amd64-3.5\lxml\include
s
copying src\lxml\includes\xmlparser.pxd -> build\lib.win-amd64-3.5\lxml\includ
es
copying src\lxml\includes\xmlschema.pxd -> build\lib.win-amd64-3.5\lxml\includ
es
copying src\lxml\includes\xpath.pxd -> build\lib.win-amd64-3.5\lxml\includes
copying src\lxml\includes\xslt.pxd -> build\lib.win-amd64-3.5\lxml\includes
copying src\lxml\includes\etree_defs.h -> build\lib.win-amd64-3.5\lxml\include
s
copying src\lxml\includes\lxml-version.h -> build\lib.win-amd64-3.5\lxml\inclu
des
creating build\lib.win-amd64-3.5\lxml\isoschematron\resources
creating build\lib.win-amd64-3.5\lxml\isoschematron\resources\rng
copying src\lxml\isoschematron\resources\rng\iso-schematron.rng -> build\lib.w
in-amd64-3.5\lxml\isoschematron\resources\rng
creating build\lib.win-amd64-3.5\lxml\isoschematron\resources\xsl
copying src\lxml\isoschematron\resources\xsl\RNG2Schtrn.xsl -> build\lib.win-a
md64-3.5\lxml\isoschematron\resources\xsl
copying src\lxml\isoschematron\resources\xsl\XSD2Schtrn.xsl -> build\lib.win-a
md64-3.5\lxml\isoschematron\resources\xsl
creating build\lib.win-amd64-3.5\lxml\isoschematron\resources\xsl\iso-schematr
on-xslt1
copying src\lxml\isoschematron\resources\xsl\iso-schematron-xslt1\iso_abstract
_expand.xsl -> build\lib.win-amd64-3.5\lxml\isoschematron\resources\xsl\iso-sche
matron-xslt1
copying src\lxml\isoschematron\resources\xsl\iso-schematron-xslt1\iso_dsdl_inc
lude.xsl -> build\lib.win-amd64-3.5\lxml\isoschematron\resources\xsl\iso-schemat
ron-xslt1
copying src\lxml\isoschematron\resources\xsl\iso-schematron-xslt1\iso_schematr
on_message.xsl -> build\lib.win-amd64-3.5\lxml\isoschematron\resources\xsl\iso-s
chematron-xslt1
copying src\lxml\isoschematron\resources\xsl\iso-schematron-xslt1\iso_schematr
on_skeleton_for_xslt1.xsl -> build\lib.win-amd64-3.5\lxml\isoschematron\resource
s\xsl\iso-schematron-xslt1
copying src\lxml\isoschematron\resources\xsl\iso-schematron-xslt1\iso_svrl_for
_xslt1.xsl -> build\lib.win-amd64-3.5\lxml\isoschematron\resources\xsl\iso-schem
atron-xslt1
copying src\lxml\isoschematron\resources\xsl\iso-schematron-xslt1\readme.txt -
> build\lib.win-amd64-3.5\lxml\isoschematron\resources\xsl\iso-schematron-xslt1
running build_ext
building 'lxml.etree' extension
error: Unable to find vcvarsall.bat
----------------------------------------
Failed building wheel for lxml
Running setup.py clean for lxml
Failed to build lxml
Installing collected packages: lxml
Running setup.py install for lxml ... error
Complete output from command c:\python35\python.exe -u -c "import setuptools
, tokenize;__file__='C:\\Users\\Dwang\\AppData\\Local\\Temp\\pip-build-738bf61u\
\lxml\\setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().r
eplace('\r\n', '\n'), __file__, 'exec'))" install --record C:\Users\Dwang\AppDat
a\Local\Temp\pip-4_tf2u3a-record\install-record.txt --single-version-externally-
managed --compile:
Building lxml version 3.6.4.
Building without Cython.
ERROR: b"'xslt-config' is not recognized as an internal or external command,
\r\noperable program or batch file.\r\n"
** make sure the development packages of libxml2 and libxslt are installed *
*
Using build configuration of libxslt
running install
running build
running build_py
creating build
creating build\lib.win-amd64-3.5
creating build\lib.win-amd64-3.5\lxml
copying src\lxml\builder.py -> build\lib.win-amd64-3.5\lxml
copying src\lxml\cssselect.py -> build\lib.win-amd64-3.5\lxml
copying src\lxml\doctestcompare.py -> build\lib.win-amd64-3.5\lxml
copying src\lxml\ElementInclude.py -> build\lib.win-amd64-3.5\lxml
copying src\lxml\pyclasslookup.py -> build\lib.win-amd64-3.5\lxml
copying src\lxml\sax.py -> build\lib.win-amd64-3.5\lxml
copying src\lxml\usedoctest.py -> build\lib.win-amd64-3.5\lxml
copying src\lxml\_elementpath.py -> build\lib.win-amd64-3.5\lxml
copying src\lxml\__init__.py -> build\lib.win-amd64-3.5\lxml
creating build\lib.win-amd64-3.5\lxml\includes
copying src\lxml\includes\__init__.py -> build\lib.win-amd64-3.5\lxml\includ
es
creating build\lib.win-amd64-3.5\lxml\html
copying src\lxml\html\builder.py -> build\lib.win-amd64-3.5\lxml\html
copying src\lxml\html\clean.py -> build\lib.win-amd64-3.5\lxml\html
copying src\lxml\html\defs.py -> build\lib.win-amd64-3.5\lxml\html
copying src\lxml\html\diff.py -> build\lib.win-amd64-3.5\lxml\html
copying src\lxml\html\ElementSoup.py -> build\lib.win-amd64-3.5\lxml\html
copying src\lxml\html\formfill.py -> build\lib.win-amd64-3.5\lxml\html
copying src\lxml\html\html5parser.py -> build\lib.win-amd64-3.5\lxml\html
copying src\lxml\html\soupparser.py -> build\lib.win-amd64-3.5\lxml\html
copying src\lxml\html\usedoctest.py -> build\lib.win-amd64-3.5\lxml\html
copying src\lxml\html\_diffcommand.py -> build\lib.win-amd64-3.5\lxml\html
copying src\lxml\html\_html5builder.py -> build\lib.win-amd64-3.5\lxml\html
copying src\lxml\html\_setmixin.py -> build\lib.win-amd64-3.5\lxml\html
copying src\lxml\html\__init__.py -> build\lib.win-amd64-3.5\lxml\html
creating build\lib.win-amd64-3.5\lxml\isoschematron
copying src\lxml\isoschematron\__init__.py -> build\lib.win-amd64-3.5\lxml\i
soschematron
copying src\lxml\lxml.etree.h -> build\lib.win-amd64-3.5\lxml
copying src\lxml\lxml.etree_api.h -> build\lib.win-amd64-3.5\lxml
copying src\lxml\includes\c14n.pxd -> build\lib.win-amd64-3.5\lxml\includes
copying src\lxml\includes\config.pxd -> build\lib.win-amd64-3.5\lxml\include
s
copying src\lxml\includes\dtdvalid.pxd -> build\lib.win-amd64-3.5\lxml\inclu
des
copying src\lxml\includes\etreepublic.pxd -> build\lib.win-amd64-3.5\lxml\in
cludes
copying src\lxml\includes\htmlparser.pxd -> build\lib.win-amd64-3.5\lxml\inc
ludes
copying src\lxml\includes\relaxng.pxd -> build\lib.win-amd64-3.5\lxml\includ
es
copying src\lxml\includes\schematron.pxd -> build\lib.win-amd64-3.5\lxml\inc
ludes
copying src\lxml\includes\tree.pxd -> build\lib.win-amd64-3.5\lxml\includes
copying src\lxml\includes\uri.pxd -> build\lib.win-amd64-3.5\lxml\includes
copying src\lxml\includes\xinclude.pxd -> build\lib.win-amd64-3.5\lxml\inclu
des
copying src\lxml\includes\xmlerror.pxd -> build\lib.win-amd64-3.5\lxml\inclu
des
copying src\lxml\includes\xmlparser.pxd -> build\lib.win-amd64-3.5\lxml\incl
udes
copying src\lxml\includes\xmlschema.pxd -> build\lib.win-amd64-3.5\lxml\incl
udes
copying src\lxml\includes\xpath.pxd -> build\lib.win-amd64-3.5\lxml\includes
copying src\lxml\includes\xslt.pxd -> build\lib.win-amd64-3.5\lxml\includes
copying src\lxml\includes\etree_defs.h -> build\lib.win-amd64-3.5\lxml\inclu
des
copying src\lxml\includes\lxml-version.h -> build\lib.win-amd64-3.5\lxml\inc
ludes
creating build\lib.win-amd64-3.5\lxml\isoschematron\resources
creating build\lib.win-amd64-3.5\lxml\isoschematron\resources\rng
copying src\lxml\isoschematron\resources\rng\iso-schematron.rng -> build\lib
.win-amd64-3.5\lxml\isoschematron\resources\rng
creating build\lib.win-amd64-3.5\lxml\isoschematron\resources\xsl
copying src\lxml\isoschematron\resources\xsl\RNG2Schtrn.xsl -> build\lib.win
-amd64-3.5\lxml\isoschematron\resources\xsl
copying src\lxml\isoschematron\resources\xsl\XSD2Schtrn.xsl -> build\lib.win
-amd64-3.5\lxml\isoschematron\resources\xsl
creating build\lib.win-amd64-3.5\lxml\isoschematron\resources\xsl\iso-schema
tron-xslt1
copying src\lxml\isoschematron\resources\xsl\iso-schematron-xslt1\iso_abstra
ct_expand.xsl -> build\lib.win-amd64-3.5\lxml\isoschematron\resources\xsl\iso-sc
hematron-xslt1
copying src\lxml\isoschematron\resources\xsl\iso-schematron-xslt1\iso_dsdl_i
nclude.xsl -> build\lib.win-amd64-3.5\lxml\isoschematron\resources\xsl\iso-schem
atron-xslt1
copying src\lxml\isoschematron\resources\xsl\iso-schematron-xslt1\iso_schema
tron_message.xsl -> build\lib.win-amd64-3.5\lxml\isoschematron\resources\xsl\iso
-schematron-xslt1
copying src\lxml\isoschematron\resources\xsl\iso-schematron-xslt1\iso_schema
tron_skeleton_for_xslt1.xsl -> build\lib.win-amd64-3.5\lxml\isoschematron\resour
ces\xsl\iso-schematron-xslt1
copying src\lxml\isoschematron\resources\xsl\iso-schematron-xslt1\iso_svrl_f
or_xslt1.xsl -> build\lib.win-amd64-3.5\lxml\isoschematron\resources\xsl\iso-sch
ematron-xslt1
copying src\lxml\isoschematron\resources\xsl\iso-schematron-xslt1\readme.txt
-> build\lib.win-amd64-3.5\lxml\isoschematron\resources\xsl\iso-schematron-xslt
1
running build_ext
building 'lxml.etree' extension
error: Unable to find vcvarsall.bat
----------------------------------------
Command "c:\python35\python.exe -u -c "import setuptools, tokenize;__file__='C:\
\Users\\Dwang\\AppData\\Local\\Temp\\pip-build-738bf61u\\lxml\\setup.py';exec(co
mpile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __
file__, 'exec'))" install --record C:\Users\Dwang\AppData\Local\Temp\pip-4_tf2u3
a-record\install-record.txt --single-version-externally-managed --compile" faile
d with error code 1 in C:\Users\Dwang\AppData\Local\Temp\pip-build-738bf61u\lxml
\
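Answer 0 (score: 6)
As I understand it, and according to the docs, if read_html() cannot use lxml it should fall back to html5lib, but that does not seem to happen in your case and an error is raised instead.
Try stating the flavor explicitly:
fifty_states = pd.read_html('https://simple.wikipedia.org/wiki/List_of_U.S._states', flavor='html5lib')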
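Before choosing a flavor, it may help to check which parser modules can actually be imported. The following is a minimal sketch and not part of the original answer: module_available and pick_flavor are hypothetical helper names, and it assumes (as the pandas docs describe) that the html5lib flavor needs both bs4 and html5lib to be installed.

```python
import importlib.util


def module_available(name):
    """Return True if a top-level module with this name can be found."""
    return importlib.util.find_spec(name) is not None


def pick_flavor(is_available=module_available):
    """Return a read_html flavor whose dependencies are present, or None.

    Hypothetical helper: prefers lxml, then falls back to html5lib,
    which (by assumption, per the pandas docs) also requires bs4.
    """
    if is_available("lxml"):
        return "lxml"
    if is_available("bs4") and is_available("html5lib"):
        return "html5lib"
    return None


flavor = pick_flavor()
print("usable flavor:", flavor)
```

If this prints None, neither parser is installed in the interpreter that runs the script. Otherwise, the returned value could be passed as flavor= to pd.read_html.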
Answer 1 (score: 1)
Try:
$ conda install -c conda-forge lxml
Answer 2 (score: 0)
I ran into the same problem in a conda environment with the latest versions of pandas and lxml.
I verified the versions with:
conda list | findstr lxml
conda list | findstr pandas
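(findstr is the Windows equivalent of grep.)
When I restarted the Jupyter kernel after reinstalling the packages, I still could not get pd.read_html() to work. Oddly, though, it let me pass a string to parse instead of a URL without any complaint. So I ran: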
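import subprocess
import pandas as pd
# Fetch the page with curl, then hand the raw HTML to pandas instead of a URL.
# Passing the command as a list avoids relying on shell parsing.
s = subprocess.check_output(["curl", "https://www.myurl.com/page.html"])
df = pd.read_html(io=s)  # read_html returns a list of DataFrames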
I don't know why this behaves any differently from letting pandas fetch the page itself, but it works, so I thought I would share it here :)
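If curl is not available, the same idea (download the page yourself, then pass the content to pandas) can be sketched with the standard library. This is an illustration under assumptions, not the answerer's method: fetch_html is a hypothetical helper name and the 30-second timeout is an arbitrary choice.

```python
import urllib.request


def fetch_html(url, timeout=30):
    """Download a URL and return the response body as bytes.

    Hypothetical stand-in for the curl call: the result plays the
    same role as the variable s in the snippet above.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()
```

The returned bytes could then be passed to pd.read_html in place of s. Newer pandas versions may expect literal HTML to be wrapped in a file-like object such as io.BytesIO, so treat that last step as something to check against your installed version.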
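Answer 3 (score: 0)
I ran into the same problem, and although the answers above made things clearer, they did not solve it for me. In my case the cause was that, at the time of writing, I could not install Pandas via pip3 (the installation took at least 30 minutes), so I had to find a more workable solution. These are the steps I took.
I hope this helps someone else as much as it helped me! :-)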