Beginner question. If I run the following command:
df = pd.read_html('https://coinmarketcap.com/currencies/veritaseum/historical-data/?start=20180101&end=20180105')
df
it produces a table:
[ Date Open* High Low Close** Volume Market Cap
0 Jan 05, 2018 366.43 395.69 356.26 380.11 426543 746285248
1 Jan 04, 2018 376.58 397.02 353.17 368.46 734384 766956544
2 Jan 03, 2018 378.55 395.28 352.05 376.05 1000590 770974464
3 Jan 02, 2018 343.79 393.92 335.60 377.54 2011340 700168512
4 Jan 01, 2018 338.14 357.18 325.33 343.12 890956 688681984]
But when I try to create a DataFrame from it:
data_table = pd.DataFrame(df)
data_table
nothing happens. Is it because Date is a string? How can I get around this? Thanks in advance.
Answer 0 (score: 1)
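read_html returns a list of DataFrames (one per table found on the page), so you first need to select the first element of that list. Then convert the Date column with to_datetime, and if you want a DatetimeIndex, add set_index: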
url = 'https://coinmarketcap.com/currencies/veritaseum/historical-data/?start=20180101&end=20180105'
df = pd.read_html(url)[0]
df['Date'] = pd.to_datetime(df['Date'], format='%b %d, %Y')
df = df.set_index('Date')
print(df)
Open* High Low Close** Volume Market Cap
Date
2018-01-05 366.43 395.69 356.26 380.11 426543 746285248
2018-01-04 376.58 397.02 353.17 368.46 734384 766956544
2018-01-03 378.55 395.28 352.05 376.05 1000590 770974464
2018-01-02 343.79 393.92 335.60 377.54 2011340 700168512
2018-01-01 338.14 357.18 325.33 343.12 890956 688681984
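The conversion step can also be tried without fetching the page. The following is a minimal sketch, assuming a small hand-typed frame that copies three rows and three columns from the table above; it is an illustration of to_datetime followed by set_index, not the data the site returns today:

```python
import pandas as pd

# Hand-typed copy of three rows from the table above (assumption:
# the real page may no longer serve this data), so no network is needed.
df = pd.DataFrame({
    "Date": ["Jan 05, 2018", "Jan 04, 2018", "Jan 03, 2018"],
    "Close**": [380.11, 368.46, 376.05],
    "Volume": [426543, 734384, 1000590],
})

# Before conversion the Date column holds plain strings.
print(df["Date"].dtype)

# Parse the strings with an explicit format, then move the column into the index.
df["Date"] = pd.to_datetime(df["Date"], format="%b %d, %Y")
df = df.set_index("Date")

# The index is now a DatetimeIndex.
print(type(df.index).__name__)
print(df)
```

Passing an explicit format is optional here, but it makes the parsing unambiguous and fails loudly if a row does not match.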
Alternatively, use the parse_dates parameter, and add index_col if you want a DatetimeIndex:
df = pd.read_html(url, index_col=0, parse_dates=[0])[0]
print(df)
Open* High Low Close** Volume Market Cap
Date
2018-01-05 366.43 395.69 356.26 380.11 426543 746285248
2018-01-04 376.58 397.02 353.17 368.46 734384 766956544
2018-01-03 378.55 395.28 352.05 376.05 1000590 770974464
2018-01-02 343.79 393.92 335.60 377.54 2011340 700168512
2018-01-01 338.14 357.18 325.33 343.12 890956 688681984
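To see why the [0] is needed, it may help to run read_html on a small inline table instead of the live page. This is a sketch under stated assumptions: the HTML string below is made up for illustration and only mimics the shape of the table above, and read_html needs an HTML parser such as lxml (or bs4 with html5lib) to be installed.

```python
from io import StringIO

import pandas as pd

# Made-up HTML that mimics the shape of the table above (assumption:
# the real page's markup is more complex than this).
html = """
<table>
  <thead>
    <tr><th>Date</th><th>Close</th><th>Volume</th></tr>
  </thead>
  <tbody>
    <tr><td>Jan 05, 2018</td><td>380.11</td><td>426543</td></tr>
    <tr><td>Jan 04, 2018</td><td>368.46</td><td>734384</td></tr>
  </tbody>
</table>
"""

# read_html always returns a list, even when the page has a single table.
tables = pd.read_html(StringIO(html), index_col=0, parse_dates=[0])
print(type(tables).__name__, len(tables))

# The actual DataFrame is the first element of that list.
df = tables[0]
print(type(df.index).__name__)
print(df)
```

Wrapping the string in StringIO avoids the warning that recent pandas versions emit when literal HTML is passed directly.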