Reindex a DataFrame and merge two columns

Time: 2018-12-28 16:13:37

Tags: python pandas dataframe

I have a CSV file of stock prices that looks like this:

Index Date       Time Open High Low Close
0     01/01/2000 900  10   12   9   11
1     01/01/2000 901

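What I want to do is drop the existing index, merge the Date and Time columns into one, and use the result as the index, formatted as a pandas time series. Thanks for your help!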

2 answers:

Answer 0: (score: 1)

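    import pandas as pd

    df = pd.DataFrame({'Date': ['01/01/2000'], 'Time': ['900']})
    # Zero-pad Time to four digits (HHMM): '900' becomes '0900', '1030' is left alone
    df['DateTime'] = df['Date'] + ' ' + df['Time'].str.zfill(4)
    # Parse with an explicit format; month/day/year is an assumption,
    # because 01/01/2000 does not show which field comes first
    df['DateTime'] = pd.to_datetime(df['DateTime'], format='%m/%d/%Y %H%M')
    df.set_index('DateTime', inplace=True)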

>>> df
                           Date Time
DateTime
2000-01-01 09:00:00  01/01/2000  900
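
The same idea can be applied while loading the file, which also covers the "drop the existing index" part of the question. The following is only a sketch: the file contents are a hypothetical comma-separated version of the sample in the question, the name `prices` is illustrative, and the dates are assumed to be month/day/year.

```python
import io

import pandas as pd

# Hypothetical file contents: a comma-separated version of the question's sample.
csv_text = """Index,Date,Time,Open,High,Low,Close
0,01/01/2000,900,10,12,9,11
1,01/01/2000,901,,,,
"""

# Read Date and Time as strings so they can be concatenated before parsing.
prices = pd.read_csv(io.StringIO(csv_text), dtype={'Date': str, 'Time': str})

# Assumption: Time is HMM or HHMM, so zero-padding to four digits gives HHMM.
stamp = prices['Date'] + ' ' + prices['Time'].str.zfill(4)
prices.index = pd.to_datetime(stamp, format='%m/%d/%Y %H%M')
prices.index.name = 'DateTime'

# The old index and the two source columns are no longer needed.
prices = prices.drop(columns=['Index', 'Date', 'Time'])
print(prices)
```

With a real file, `io.StringIO(csv_text)` would be replaced by the file path, and the separator may need adjusting if the file is not comma-delimited.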

Answer 1: (score: 0)

Your data:

import pandas as pd

data = pd.DataFrame({'Date': ['01/01/2000', '01/01/2000'], 'Time': [900, 901], 'Open': [10, None],
                     'High': [12, None], 'Low': [9, None], 'Close': [11, None]})

This may not be the best solution, but it works.

data['Date'] = pd.to_datetime(data['Date'])
data['Minutes'] = data['Time'].astype(str).str[-2:]  # last two digits are the minutes
data['Hours'] = data['Time'].astype(str).str[:-2]  # remaining leading digits are the hours
# Build the index as Date plus the hour and minute offsets
data.index = data['Date'] + pd.to_timedelta(data['Hours'].astype(int), unit='h') + \
    pd.to_timedelta(data['Minutes'].astype(int), unit='m')

Output:

                          Date  Time  Open  High  Low  Close Minutes Hours
2000-01-01 09:00:00 2000-01-01   900  10.0  12.0  9.0   11.0      00     9
2000-01-01 09:01:00 2000-01-01   901   NaN   NaN  NaN    NaN      01     9

Then drop the helper columns, assigning the result back because drop returns a new DataFrame:

data = data.drop(columns=['Date', 'Time', 'Minutes', 'Hours'])

Final output:

                     Open  High  Low  Close
2000-01-01 09:00:00  10.0  12.0  9.0   11.0
2000-01-01 09:01:00   NaN   NaN  NaN    NaN
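
The hour and minute split in this answer can also be done with integer arithmetic, which avoids the helper string columns. This is a variant sketch, not the answer's own code: it assumes Time is an integer encoded as HHMM (so 900 means 09:00) and that the dates are month/day/year.

```python
import pandas as pd

data = pd.DataFrame({'Date': ['01/01/2000', '01/01/2000'], 'Time': [900, 901],
                     'Open': [10, None], 'High': [12, None],
                     'Low': [9, None], 'Close': [11, None]})

# Assumption: Time is an integer HHMM value, e.g. 900 -> 9 hours, 0 minutes.
hours = data['Time'] // 100
minutes = data['Time'] % 100

# Date at midnight plus the hour and minute offsets gives the full timestamp.
data.index = (pd.to_datetime(data['Date'], format='%m/%d/%Y')
              + pd.to_timedelta(hours, unit='h')
              + pd.to_timedelta(minutes, unit='m'))

data = data.drop(columns=['Date', 'Time'])
print(data)
```

Because no Minutes or Hours columns are created, only Date and Time have to be dropped at the end.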