这是相同的csv:
Time,o1,h1,l1,c1,o2,h2,l2,c2
2019.04.08 23:59:00,1.12582,1.1261,1.12582,1.12609,1.12588,1.12591,1.12587,1.12588
2019.04.09 00:00:00,1.12587,1.12598,1.12587,1.12598,1.12583,1.1261,1.12582,1.12609
2019.04.09 00:01:00,1.12599,1.12607,1.12587,1.12607,1.12598,1.12607,1.12587,1.12607
2019.04.09 00:02:00,1.12607,1.12632,1.12605,1.12632,1.12598,1.12607,1.12587,1.12607
2019.04.09 00:03:00,1.12603,1.12614,1.12603,1.12613,1.12607,1.12632,1.12603,1.12613
2019.04.09 00:04:00,1.12613,1.12613,1.12609,1.12611,1.12607,1.12632,1.12603,1.12613
2019.04.09 00:05:00,1.12612,1.12613,1.12612,1.12613,1.12613,1.12613,1.12609,1.12613
2019.04.09 00:06:00,1.12613,1.12613,1.12613,1.12613,1.12613,1.12613,1.12609,1.12613
2019.04.09 00:07:00,1.12613,1.12613,1.1261,1.12612,1.12613,1.12613,1.1261,1.12612
2019.04.09 00:08:00,1.12612,1.12612,1.12612,1.12612,1.12613,1.12613,1.1261,1.12612
2019.04.09 00:09:00,1.12612,1.12612,1.12609,1.12609,1.12612,1.12612,1.12609,1.12609
2019.04.09 00:10:00,1.12609,1.12609,1.12602,1.12607,1.12612,1.12612,1.12609,1.12609
2019.04.09 00:11:00,1.12607,1.12609,1.12607,1.12608,1.12609,1.12609,1.12602,1.12608
2019.04.09 00:12:00,1.12608,1.12608,1.12608,1.12608,1.12609,1.12609,1.12602,1.12608
2019.04.09 00:13:00,1.12608,1.12608,1.126,1.126,1.12608,1.12608,1.126,1.126
2019.04.09 00:14:00,1.126,1.126,1.1259,1.12592,1.12608,1.12608,1.126,1.126
2019.04.09 00:15:00,1.12592,1.12614,1.12592,1.12613,1.126,1.12614,1.1259,1.12613
2019.04.09 00:16:00,1.12614,1.12619,1.12612,1.12619,1.126,1.12614,1.1259,1.12613
2019.04.09 00:17:00,1.12619,1.1262,1.12615,1.12615,1.12614,1.1262,1.12612,1.12615
2019.04.09 00:18:00,1.12616,1.12616,1.12608,1.12608,1.12614,1.1262,1.12612,1.12615
2019.04.09 00:19:00,1.12608,1.12611,1.12608,1.12611,1.12616,1.12616,1.12608,1.12611
2019.04.09 00:20:00,1.12608,1.12637,1.12608,1.1262,1.12616,1.12616,1.12608,1.12611
2019.04.09 00:21:00,1.1262,1.12631,1.12614,1.12623,1.12608,1.12637,1.12608,1.12623
2019.04.09 00:22:00,1.12624,1.12646,1.12619,1.12626,1.12608,1.12637,1.12608,1.12623
2019.04.09 00:23:00,1.12626,1.12628,1.12585,1.1261,1.12624,1.12646,1.12585,1.1261
2019.04.09 00:24:00,1.12609,1.12621,1.12607,1.12616,1.12624,1.12646,1.12585,1.1261
2019.04.09 00:25:00,1.12615,1.12625,1.12609,1.1262,1.12609,1.12625,1.12607,1.1262
2019.04.09 00:26:00,1.1262,1.12629,1.12615,1.12618,1.12609,1.12625,1.12607,1.1262
2019.04.09 00:27:00,1.12618,1.12621,1.12614,1.12617,1.1262,1.12629,1.12614,1.12617
2019.04.09 00:28:00,1.12618,1.12618,1.12586,1.12587,1.1262,1.12629,1.12614,1.12617
2019.04.09 00:29:00,1.12586,1.12598,1.12586,1.12597,1.12618,1.12618,1.12586,1.12597
2019.04.09 00:30:00,1.12598,1.12608,1.12594,1.12604,1.12618,1.12618,1.12586,1.12597
我想知道如何在Pandas Dataframe中做到这一点:
Time,o1,h1,l1,c1,o2,h2,l2,c2,o11,h11,l11,c11
2019.04.08 23:59:00,1.12582,1.1261,1.12582,1.12609,1.12588,1.12591,1.12587,1.12588,1.12587,1.12598,1.12587,1.12598
2019.04.09 00:01:00,1.12599,1.12607,1.12587,1.12607,1.12598,1.12607,1.12587,1.12607,1.12607,1.12632,1.12605,1.12632
2019.04.09 00:03:00,1.12603,1.12614,1.12603,1.12613,1.12607,1.12632,1.12603,1.12613,1.12613,1.12613,1.12609,1.12611
2019.04.09 00:05:00,1.12612,1.12613,1.12612,1.12613,1.12613,1.12613,1.12609,1.12613,1.12613,1.12613,1.12613,1.12613
2019.04.09 00:07:00,1.12613,1.12613,1.1261,1.12612,1.12613,1.12613,1.1261,1.12612,1.12612,1.12612,1.12612,1.12612
2019.04.09 00:09:00,1.12612,1.12612,1.12609,1.12609,1.12612,1.12612,1.12609,1.12609,1.12609,1.12609,1.12602,1.12607
我想要将2分钟到1分钟的数据附加到上面的示例中。
请让我知道。
答案 0 :(得分:2)
将Index.get_indexer
用于列表中的列位置和DataFrame.iloc
中的合适行,最后DataFrame.join
一起:
pos2 = df.columns.get_indexer(['o1','h1','l1','c1'])
df1 = df.iloc[0::2].reset_index(drop=True)
df2 = df.iloc[1::2, pos2].reset_index(drop=True)
df = df1.join(df2, rsuffix='1')
print (df)
Time o1 h1 l1 c1 o2 h2 \
0 2019.04.08 23:59:00 1.12582 1.12610 1.12582 1.12609 1.12588 1.12591
1 2019.04.09 00:01:00 1.12599 1.12607 1.12587 1.12607 1.12598 1.12607
2 2019.04.09 00:03:00 1.12603 1.12614 1.12603 1.12613 1.12607 1.12632
3 2019.04.09 00:05:00 1.12612 1.12613 1.12612 1.12613 1.12613 1.12613
4 2019.04.09 00:07:00 1.12613 1.12613 1.12610 1.12612 1.12613 1.12613
5 2019.04.09 00:09:00 1.12612 1.12612 1.12609 1.12609 1.12612 1.12612
6 2019.04.09 00:11:00 1.12607 1.12609 1.12607 1.12608 1.12609 1.12609
7 2019.04.09 00:13:00 1.12608 1.12608 1.12600 1.12600 1.12608 1.12608
8 2019.04.09 00:15:00 1.12592 1.12614 1.12592 1.12613 1.12600 1.12614
9 2019.04.09 00:17:00 1.12619 1.12620 1.12615 1.12615 1.12614 1.12620
10 2019.04.09 00:19:00 1.12608 1.12611 1.12608 1.12611 1.12616 1.12616
11 2019.04.09 00:21:00 1.12620 1.12631 1.12614 1.12623 1.12608 1.12637
12 2019.04.09 00:23:00 1.12626 1.12628 1.12585 1.12610 1.12624 1.12646
13 2019.04.09 00:25:00 1.12615 1.12625 1.12609 1.12620 1.12609 1.12625
14 2019.04.09 00:27:00 1.12618 1.12621 1.12614 1.12617 1.12620 1.12629
15 2019.04.09 00:29:00 1.12586 1.12598 1.12586 1.12597 1.12618 1.12618
l2 c2 o11 h11 l11 c11
0 1.12587 1.12588 1.12587 1.12598 1.12587 1.12598
1 1.12587 1.12607 1.12607 1.12632 1.12605 1.12632
2 1.12603 1.12613 1.12613 1.12613 1.12609 1.12611
3 1.12609 1.12613 1.12613 1.12613 1.12613 1.12613
4 1.12610 1.12612 1.12612 1.12612 1.12612 1.12612
5 1.12609 1.12609 1.12609 1.12609 1.12602 1.12607
6 1.12602 1.12608 1.12608 1.12608 1.12608 1.12608
7 1.12600 1.12600 1.12600 1.12600 1.12590 1.12592
8 1.12590 1.12613 1.12614 1.12619 1.12612 1.12619
9 1.12612 1.12615 1.12616 1.12616 1.12608 1.12608
10 1.12608 1.12611 1.12608 1.12637 1.12608 1.12620
11 1.12608 1.12623 1.12624 1.12646 1.12619 1.12626
12 1.12585 1.12610 1.12609 1.12621 1.12607 1.12616
13 1.12607 1.12620 1.12620 1.12629 1.12615 1.12618
14 1.12614 1.12617 1.12618 1.12618 1.12586 1.12587
15 1.12586 1.12597 1.12598 1.12608 1.12594 1.12604
编辑:通过DataFrame.asfreq
以一分钟的频率添加缺失的行:
print (df)
Time o1 h1 l1 c1 o2 h2 \
0 2019.04.10 16:16:00 1.12308 1.12340 1.12304 1.12327 1.12350 1.12369
1 2019.04.10 16:17:00 1.12327 1.12333 1.12302 1.12322 1.12308 1.12340
2 2019.04.10 16:21:00 1.12314 1.12354 1.12313 1.12353 1.12314 1.12354
3 2019.04.10 16:22:00 1.12353 1.12362 1.12345 1.12359 1.12314 1.12354
4 2019.04.10 16:23:00 1.12359 1.12389 1.12357 1.12383 1.12353 1.12389
5 2019.04.10 16:24:00 1.12383 1.12396 1.12375 1.12384 1.12353 1.12389
6 2019.04.10 16:25:00 1.12383 1.12389 1.12373 1.12376 1.12383 1.12396
7 2019.04.10 16:26:00 1.12376 1.12386 1.12355 1.12361 1.12383 1.12396
l2 c2
0 1.12300 1.12308
1 1.12302 1.12322
2 1.12313 1.12353
3 1.12313 1.12353
4 1.12345 1.12383
5 1.12345 1.12383
6 1.12373 1.12376
7 1.12373 1.12376
df['Time'] = pd.to_datetime(df['Time'])
df = df.set_index('Time').asfreq('1min').reset_index()
#if need forward filling missing values
#df = df.set_index('Time').asfreq('1min', method='ffill').reset_index()
print (df)
Time o1 h1 l1 c1 o2 h2 \
0 2019-04-10 16:16:00 1.12308 1.12340 1.12304 1.12327 1.12350 1.12369
1 2019-04-10 16:17:00 1.12327 1.12333 1.12302 1.12322 1.12308 1.12340
2 2019-04-10 16:18:00 NaN NaN NaN NaN NaN NaN
3 2019-04-10 16:19:00 NaN NaN NaN NaN NaN NaN
4 2019-04-10 16:20:00 NaN NaN NaN NaN NaN NaN
5 2019-04-10 16:21:00 1.12314 1.12354 1.12313 1.12353 1.12314 1.12354
6 2019-04-10 16:22:00 1.12353 1.12362 1.12345 1.12359 1.12314 1.12354
7 2019-04-10 16:23:00 1.12359 1.12389 1.12357 1.12383 1.12353 1.12389
8 2019-04-10 16:24:00 1.12383 1.12396 1.12375 1.12384 1.12353 1.12389
9 2019-04-10 16:25:00 1.12383 1.12389 1.12373 1.12376 1.12383 1.12396
10 2019-04-10 16:26:00 1.12376 1.12386 1.12355 1.12361 1.12383 1.12396
l2 c2
0 1.12300 1.12308
1 1.12302 1.12322
2 NaN NaN
3 NaN NaN
4 NaN NaN
5 1.12313 1.12353
6 1.12313 1.12353
7 1.12345 1.12383
8 1.12345 1.12383
9 1.12373 1.12376
10 1.12373 1.12376
pos2 = df.columns.get_indexer(['o1','h1','l1','c1'])
df1 = df.iloc[0::2].reset_index(drop=True)
df2 = df.iloc[1::2, pos2].reset_index(drop=True)
df = df1.join(df2, rsuffix='1')
print (df)
Time o1 h1 l1 c1 o2 h2 \
0 2019-04-10 16:16:00 1.12308 1.12340 1.12304 1.12327 1.12350 1.12369
1 2019-04-10 16:18:00 NaN NaN NaN NaN NaN NaN
2 2019-04-10 16:20:00 NaN NaN NaN NaN NaN NaN
3 2019-04-10 16:22:00 1.12353 1.12362 1.12345 1.12359 1.12314 1.12354
4 2019-04-10 16:24:00 1.12383 1.12396 1.12375 1.12384 1.12353 1.12389
5 2019-04-10 16:26:00 1.12376 1.12386 1.12355 1.12361 1.12383 1.12396
l2 c2 o11 h11 l11 c11
0 1.12300 1.12308 1.12327 1.12333 1.12302 1.12322
1 NaN NaN NaN NaN NaN NaN
2 NaN NaN 1.12314 1.12354 1.12313 1.12353
3 1.12313 1.12353 1.12359 1.12389 1.12357 1.12383
4 1.12345 1.12383 1.12383 1.12389 1.12373 1.12376
5 1.12373 1.12376 NaN NaN NaN NaN
答案 1 :(得分:0)
分割数据然后加入:
# split data every other row
n = df[0::2]
n1 = df[1::2]
# reset_index and join
new_df = n.reset_index(drop=True).join(n1.reset_index(drop=True), rsuffix='_other')
然后删除您不想要的列