I have a CSV file of stock prices that looks like this:
Index  Date        Time  Open  High  Low  Close
0      01/01/2000  900   10    12    9    11
1      01/01/2000  901
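What I want to do is drop the existing index, merge the Date and Time columns into one, and use the result as the index, formatted as a pandas time series. Thanks for your help!
Answer 0 (score: 1)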
import pandas as pd

df = pd.DataFrame({'Date': ['01/01/2000'], 'Time': ['900']})
# Zero-pad Time to four digits (HHMM) so '900' becomes '0900' and '1030' stays intact
df['DateTime'] = df['Date'] + ' ' + df['Time'].str.zfill(4)
# Parse with an explicit format; month-first dates (MM/DD/YYYY) are assumed here
df['DateTime'] = pd.to_datetime(df['DateTime'], format='%m/%d/%Y %H%M')
df.set_index('DateTime', inplace=True)
>>> df
                           Date Time
DateTime
2000-01-01 09:00:00  01/01/2000  900
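Applied to the whole file from the question, the same idea could look roughly like the sketch below. Several things here are assumptions and not taken from the question: the file content is inlined through `io.StringIO` because the real file name is not given, the columns are assumed to be comma-separated, the missing prices in the second row are treated as empty fields, and dates are assumed to be month-first.

```python
import io
import pandas as pd

# Hypothetical stand-in for the CSV file; comma separators are an assumption.
csv_text = """Index,Date,Time,Open,High,Low,Close
0,01/01/2000,900,10,12,9,11
1,01/01/2000,901,,,,
"""

# Read Date and Time as strings so they can be concatenated safely.
df = pd.read_csv(io.StringIO(csv_text), dtype={'Date': str, 'Time': str})

# Zero-pad Time to HHMM, join it with Date, and parse (month-first is assumed).
stamp = df['Date'] + ' ' + df['Time'].str.zfill(4)
df.index = pd.to_datetime(stamp, format='%m/%d/%Y %H%M')
df.index.name = 'DateTime'

# Drop the old index column and the now-redundant Date and Time columns.
df = df.drop(columns=['Index', 'Date', 'Time'])
print(df)
```

Reading `Time` as a string keeps the zero-padding step simple; if the column is read as integers instead, it would need `.astype(str)` before `.str.zfill(4)`.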
Answer 1 (score: 0)
Your data:
import pandas as pd

data = pd.DataFrame({'Date': ['01/01/2000', '01/01/2000'], 'Time': [900, 901], 'Open': [10, None],
                     'High': [12, None], 'Low': [9, None], 'Close': [11, None]})
This may not be the best solution, but it works.
data['Date'] = pd.to_datetime(data['Date'])
data['Minutes'] = data['Time'].astype(str).str[-2:]  # last two digits are the minutes
data['Hours'] = data['Time'].astype(str).str[:-2]  # remaining leading digits are the hours
# Use Date plus the hour and minute offsets as the index
data.index = (data['Date']
              + pd.to_timedelta(data['Hours'].astype(int), unit='h')
              + pd.to_timedelta(data['Minutes'].astype(int), unit='m'))
Output:
                          Date  Time  Open  High  Low  Close Minutes Hours
2000-01-01 09:00:00 2000-01-01   900  10.0  12.0  9.0   11.0      00     9
2000-01-01 09:01:00 2000-01-01   901   NaN   NaN  NaN    NaN      01     9
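The string slicing above relies on Time having at least three digits. As a possible alternative (a sketch, not part of the original answer), the hours and minutes can be split with integer arithmetic, which also copes with values such as 5 for 00:05. The month-first date format is an assumption.

```python
import pandas as pd

data = pd.DataFrame({'Date': ['01/01/2000', '01/01/2000'], 'Time': [900, 901]})

# Integer division and remainder split HHMM into hours and minutes,
# so values below 100 (for example 5, meaning 00:05) still work.
hours = data['Time'] // 100
minutes = data['Time'] % 100

# Month-first dates (MM/DD/YYYY) are an assumption here.
index = (pd.to_datetime(data['Date'], format='%m/%d/%Y')
         + pd.to_timedelta(hours, unit='h')
         + pd.to_timedelta(minutes, unit='m'))
print(index.tolist())
```

Assigning `index` to `data.index` then gives the same kind of datetime index without the helper Minutes and Hours columns.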
Then drop the columns (assign the result back, because drop returns a new DataFrame):
data = data.drop(columns=['Date', 'Time', 'Minutes', 'Hours'])
Final output:
                     Open  High  Low  Close
2000-01-01 09:00:00  10.0  12.0  9.0   11.0
2000-01-01 09:01:00   NaN   NaN  NaN    NaN
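To check that the result really behaves as a time series, something like the following sketch may help. It rebuilds the final frame by hand from the values shown above, so the construction itself is only illustrative.

```python
import pandas as pd

# Rebuild the final frame shown above (values copied from the answer).
idx = pd.to_datetime(['2000-01-01 09:00:00', '2000-01-01 09:01:00'])
data = pd.DataFrame({'Open': [10.0, None], 'High': [12.0, None],
                     'Low': [9.0, None], 'Close': [11.0, None]}, index=idx)

# A DatetimeIndex is what enables time-based selection.
is_time_index = isinstance(data.index, pd.DatetimeIndex)
print(is_time_index)

# Partial-string selection: every row on that calendar day.
day = data.loc['2000-01-01']

# Time-of-day selection across the index.
morning = data.between_time('09:00', '09:30')
print(len(day), len(morning))
```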