数据帧如下(3项...原始数以百计):
Log1 Date2 Log2 Date3 Log3
Date1
01.01.2000 1000 02.01.2000 2000 01.01.2000 3000
02.01.2000 1050 03.01.2000 1950 02.01.2000 3020
03.01.2000 1100 04.01.2000 2000 03.01.2000 3000
所需的前景,对齐日期:
Log1 Log2 Log3
Date
01.01.2000 1,000 nan 3,000
02.01.2000 1,050 2,000 3,020
03.01.2000 1,100 1,950 3,000
04.01.2000 nan 2,000 nan
数据框的简短示例:
BBAS3 Data.1 PETR4 Data.2 TRSD Data.3 JKHD
Data
2020-10-05 30.15 2020-10-05 19.91 2020-10-05 30.15 2020-10-05 19.91
2020-10-02 29.71 2020-10-02 19.02 2020-10-02 29.71 2020-10-01 19.85
2020-10-01 29.79 2020-10-01 19.85 2020-10-01 29.79 2020-09-30 19.61
2020-09-30 29.62 2020-09-30 19.61 2020-09-30 29.62 2020-09-29 19.31
2020-09-29 29.76 2020-09-29 19.31 2020-09-29 29.76 2020-09-28 19.63
答案 0 :(得分:1)
如果输入数据具有DatetimeIndex
循环不成对并配对列名称的想法,请一起创建Series
和concat
:
#convert Datetimeindex to column
df1 = df.reset_index()
zipped = zip(df1.columns[::2], df1.columns[1::2])
df1 = pd.concat([df1.set_index(a)[b] for a, b in zipped], axis=1)
df1.index = pd.to_datetime(df1.index)
df1 = df1.sort_index()
print (df1)
BBAS3 PETR4 TRSD JKHD
2020-09-28 NaN NaN NaN 19.63
2020-09-29 29.76 19.31 29.76 19.31
2020-09-30 29.62 19.61 29.62 19.61
2020-10-01 29.79 19.85 29.79 19.85
2020-10-02 29.71 19.02 29.71 NaN
2020-10-05 30.15 19.91 30.15 19.91
编辑:
#sample data generate error - because duplicated dates in soem column like here in Data
print (df)
BBAS3 Data.1 PETR4 Data.2 TRSD Data.3 JKHD
Data
2020-10-05 200.00 2020-10-05 19.91 2020-10-05 30.15 2020-10-05 19.91
2020-10-05 100.00 2020-10-02 19.02 2020-10-02 29.71 2020-10-01 19.85
2020-10-01 29.79 2020-10-01 19.85 2020-10-01 29.79 2020-09-30 19.61
2020-09-30 29.62 2020-09-30 19.61 2020-09-30 29.62 2020-09-29 19.31
2020-09-29 29.76 2020-09-29 19.31 2020-09-29 29.76 2020-09-28 19.63
df1 = df.reset_index()
zipped = zip(df1.columns[::2], df1.columns[1::2])
df1 = pd.concat([df1.groupby(a)[b].sum() for a, b in zipped], axis=1)
df1.index = pd.to_datetime(df1.index)
df1 = df1.sort_index()
print (df1)
BBAS3 PETR4 TRSD JKHD
2020-09-28 NaN NaN NaN 19.63
2020-09-29 29.76 19.31 29.76 19.31
2020-09-30 29.62 19.61 29.62 19.61
2020-10-01 29.79 19.85 29.79 19.85
2020-10-02 NaN 19.02 29.71 NaN
2020-10-05 300.00 19.91 30.15 19.91