I am trying to read the data at this URL, "https://archive.ics.uci.edu/ml/machine-learning-databases/parkinsons/parkinsons.data", into a pandas DataFrame.
I used this approach:
park_df = pd.read_html('https://archive.ics.uci.edu/ml/machine-learning-databases/parkinsons/parkinsons.data', header=0, flavor='bs4')
But I get the following error:
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-18-804373f977ab> in <module>()
----> 1 park_df = pd.read_html('https://archive.ics.uci.edu/ml/machine-learning-databases/parkinsons/parkinsons.data', header=0, flavor='bs4')

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\io\html.py in read_html(io, match, flavor, header, index_col, skiprows, attrs, parse_dates, tupleize_cols, thousands, encoding, decimal, converters, na_values, keep_default_na, displayed_only)
    985                   decimal=decimal, converters=converters, na_values=na_values,
    986                   keep_default_na=keep_default_na,
--> 987                   displayed_only=displayed_only)

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\io\html.py in _parse(flavor, io, match, attrs, encoding, displayed_only, **kwargs)
    813             break
    814         else:
--> 815             raise_with_traceback(retained)
    816
    817     ret = []

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\compat\__init__.py in raise_with_traceback(exc, traceback)
    402         if traceback == Ellipsis:
    403             _, _, traceback = sys.exc_info()
--> 404         raise exc.with_traceback(traceback)
    405     else:
    406         # this version of raise is a syntax error in Python 3

ValueError: No tables found
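Can you tell me what I am doing wrong here, and whether there is a better alternative? Please open the URL to see what the data looks like: the first row is a header containing the column names, and the data follows below it.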
Answer 0: (score: 2)
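The function read_html is for converting HTML tables into pandas DataFrames. The URL serves plain comma-separated text with no <table> elements, which is why it raises "No tables found". For CSV-formatted data, use read_csv: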
import pandas as pd

url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/parkinsons/parkinsons.data'
# The first line of the file is used as the header by default
df = pd.read_csv(url)
print(df.head())
             name  MDVP:Fo(Hz)  MDVP:Fhi(Hz)  MDVP:Flo(Hz)  MDVP:Jitter(%)  \
0  phon_R01_S01_1      119.992       157.302        74.997         0.00784
1  phon_R01_S01_2      122.400       148.650       113.819         0.00968
2  phon_R01_S01_3      116.682       131.111       111.555         0.01050
3  phon_R01_S01_4      116.676       137.871       111.366         0.00997
4  phon_R01_S01_5      116.014       141.781       110.655         0.01284

   MDVP:Jitter(Abs)  MDVP:RAP  MDVP:PPQ  Jitter:DDP  MDVP:Shimmer  ...  \
0           0.00007   0.00370   0.00554     0.01109       0.04374  ...
1           0.00008   0.00465   0.00696     0.01394       0.06134  ...
2           0.00009   0.00544   0.00781     0.01633       0.05233  ...
3           0.00009   0.00502   0.00698     0.01505       0.05492  ...
4           0.00011   0.00655   0.00908     0.01966       0.06425  ...

   Shimmer:DDA      NHR     HNR  status      RPDE       DFA    spread1  \
0      0.06545  0.02211  21.033       1  0.414783  0.815285  -4.813031
1      0.09403  0.01929  19.085       1  0.458359  0.819521  -4.075192
2      0.08270  0.01309  20.651       1  0.429895  0.825288  -4.443179
3      0.08771  0.01353  20.644       1  0.434969  0.819235  -4.117501
4      0.10470  0.01767  19.649       1  0.417356  0.823484  -3.747787

    spread2        D2       PPE
0  0.266482  2.301442  0.284654
1  0.335590  2.486855  0.368674
2  0.311173  2.342259  0.332634
3  0.334147  2.405554  0.368975
4  0.234513  2.332180  0.410335

[5 rows x 24 columns]
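
If you want to check the header handling without touching the network, here is a minimal sketch. It assumes a tiny in-memory excerpt (the `sample` string and the `df_sample` name are made up for illustration; the values are the first three columns of the first two rows printed above) and passes it to read_csv through io.StringIO:

```python
import io

import pandas as pd

# Illustrative excerpt only: first three columns of the first two rows
# from the output above, written out as comma-separated text.
sample = (
    "name,MDVP:Fo(Hz),MDVP:Fhi(Hz)\n"
    "phon_R01_S01_1,119.992,157.302\n"
    "phon_R01_S01_2,122.400,148.650\n"
)

# read_csv accepts any file-like object; with the default header=0
# the first line becomes the column names.
df_sample = pd.read_csv(io.StringIO(sample))

print(df_sample.columns.tolist())
print(df_sample.dtypes)
```

The same call works on the URL because read_csv only needs delimited text, whereas read_html looks for HTML table markup and finds none in that file.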