Downloading a CSV from a GitHub link raises a tokenizing error

Time: 2017-11-12 22:26:52

Tags: python-3.x pandas csv url data-science

I get an error when trying to import this CSV from GitHub:

import pandas as pd

url = "https://github.com/authman/DAT210x/blob/master/Module2/Datasets/tutorial.csv"
df = pd.read_csv(url)

This raises the following exception:

ParserError: Error tokenizing data. C error: Expected 1 fields in line 114, saw 3

1 answer:

Answer 0 (score: 3)

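That URL points to GitHub's HTML page for the file, not to the file itself, so pandas ends up trying to parse HTML markup as CSV. You need the "raw" link: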

In [11]: url = "https://github.com/authman/DAT210x/raw/master/Module2/Datasets/tutorial.csv"

In [12]: pd.read_csv(url)
Out[12]:
       col0      col1      col2      col3
0 -0.722876 -1.330682  1.309208  0.232378
1  1.160396 -0.730879  0.677368  1.044722
2 -1.062870 -0.503704 -0.238536 -1.417937
3  0.437078  0.362640 -0.111228 -1.649853
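
If you often copy links from the GitHub web interface, the blob-to-raw rewrite can be automated. The following is only a minimal sketch: `github_blob_to_raw` is a hypothetical helper name, not part of pandas or any GitHub library, and it assumes the usual `/<owner>/<repo>/blob/<ref>/<path>` URL layout on `github.com`. Any URL that does not match that layout is returned unchanged.

```python
from urllib.parse import urlsplit, urlunsplit


def github_blob_to_raw(url: str) -> str:
    """Rewrite a GitHub 'blob' page URL into the matching 'raw' URL.

    Hypothetical helper for illustration. URLs that do not look like
    GitHub blob pages are returned unchanged.
    """
    parts = urlsplit(url)
    segments = parts.path.split("/")
    # Assumed path layout: /<owner>/<repo>/blob/<ref>/<path to file>
    # which splits into ['', owner, repo, 'blob', ref, ...]
    if parts.netloc == "github.com" and len(segments) > 4 and segments[3] == "blob":
        segments[3] = "raw"
        return urlunsplit(parts._replace(path="/".join(segments)))
    return url


page_url = "https://github.com/authman/DAT210x/blob/master/Module2/Datasets/tutorial.csv"
raw_url = github_blob_to_raw(page_url)
print(raw_url)
```

The resulting `raw_url` can then be passed to `pd.read_csv(raw_url)` exactly as in the session above.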