I was wondering if there is a better way to combine two DataFrames.
import numpy as np
import pandas as pd
# create random data sets
N = 50
df = pd.DataFrame({'date': pd.date_range('2000-1-1', periods=N, freq='H'),
                   'value': np.random.random(N)})
index = pd.DatetimeIndex(df['date'])
peak_time = df.iloc[index.indexer_between_time('7:00','9:00')]
lunch_time = df.iloc[index.indexer_between_time('12:00','14:00')]
comb_data = pd.concat([peak_time, lunch_time], ignore_index=True)
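Is there a way to combine two ranges with a logical operator when using between_time?
I need this to create a new column in df called 'isPeak' that holds 1 when the time falls within 7:00~9:00 or 12:00~14:00, and 0 otherwise.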
Answer 0 (score: 3)
np.union1d works for me:
import numpy as np
idx = np.union1d(index.indexer_between_time('7:00','9:00'),
                 index.indexer_between_time('12:00','14:00'))
comb_data = df.iloc[idx]
print(comb_data)
date value
7 2000-01-01 07:00:00 0.760627
8 2000-01-01 08:00:00 0.236474
9 2000-01-01 09:00:00 0.626146
12 2000-01-01 12:00:00 0.625335
13 2000-01-01 13:00:00 0.793105
14 2000-01-01 14:00:00 0.706873
31 2000-01-02 07:00:00 0.113688
32 2000-01-02 08:00:00 0.035565
33 2000-01-02 09:00:00 0.230603
36 2000-01-02 12:00:00 0.423155
37 2000-01-02 13:00:00 0.947584
38 2000-01-02 14:00:00 0.226181
An alternative is numpy.r_, which concatenates the two position arrays without sorting them:
idx = np.r_[index.indexer_between_time('7:00','9:00'),
            index.indexer_between_time('12:00','14:00')]
comb_data = df.iloc[idx]
print(comb_data)
date value
7 2000-01-01 07:00:00 0.760627
8 2000-01-01 08:00:00 0.236474
9 2000-01-01 09:00:00 0.626146
31 2000-01-02 07:00:00 0.113688
32 2000-01-02 08:00:00 0.035565
33 2000-01-02 09:00:00 0.230603
12 2000-01-01 12:00:00 0.625335
13 2000-01-01 13:00:00 0.793105
14 2000-01-01 14:00:00 0.706873
36 2000-01-02 12:00:00 0.423155
37 2000-01-02 13:00:00 0.947584
38 2000-01-02 14:00:00 0.226181
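The difference between the two approaches matters most when the ranges overlap. A minimal sketch, assuming two hypothetical overlapping ranges (7:00~9:00 and 8:00~10:00, not the ones from the question): np.r_ keeps the shared positions twice, while np.union1d returns each position once, in sorted order.

```python
import numpy as np
import pandas as pd

# One day of hourly timestamps; 'h' is the lowercase hourly alias.
index = pd.date_range('2000-01-01', periods=24, freq='h')

# Hypothetical overlapping ranges, chosen only to show the difference.
a = index.indexer_between_time('7:00', '9:00')    # positions 7, 8, 9
b = index.indexer_between_time('8:00', '10:00')   # positions 8, 9, 10

joined = np.r_[a, b]        # plain concatenation, duplicates kept
merged = np.union1d(a, b)   # sorted, duplicates removed

print(joined.tolist())  # [7, 8, 9, 8, 9, 10]
print(merged.tolist())  # [7, 8, 9, 10]
```

With non-overlapping ranges, as in the question, both give the same set of rows and only the row order differs.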
A pure pandas solution converts the arrays to an Index and uses Index.union:
idx = (pd.Index(index.indexer_between_time('7:00','9:00'))
         .union(pd.Index(index.indexer_between_time('12:00','14:00'))))
comb_data = df.iloc[idx]
print(comb_data)
date value
7 2000-01-01 07:00:00 0.760627
8 2000-01-01 08:00:00 0.236474
9 2000-01-01 09:00:00 0.626146
12 2000-01-01 12:00:00 0.625335
13 2000-01-01 13:00:00 0.793105
14 2000-01-01 14:00:00 0.706873
31 2000-01-02 07:00:00 0.113688
32 2000-01-02 08:00:00 0.035565
33 2000-01-02 09:00:00 0.230603
36 2000-01-02 12:00:00 0.423155
37 2000-01-02 13:00:00 0.947584
38 2000-01-02 14:00:00 0.226181
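For the 'isPeak' column from the question, one possible sketch (not part of the original answer) reuses the union of positions to fill a 0/1 flag array. It also builds an equivalent boolean mask with the logical operator |, based on minutes since midnight; that mask is an assumption of this sketch, not something between_time provides directly. The names is_peak, minutes and mask are illustrative.

```python
import numpy as np
import pandas as pd

N = 50
df = pd.DataFrame({'date': pd.date_range('2000-01-01', periods=N, freq='h'),
                   'value': np.random.random(N)})
index = pd.DatetimeIndex(df['date'])

# Integer positions of rows that fall in either time range.
idx = np.union1d(index.indexer_between_time('7:00', '9:00'),
                 index.indexer_between_time('12:00', '14:00'))

# Start with all zeros, then flag the peak positions with 1.
is_peak = np.zeros(len(df), dtype=int)
is_peak[idx] = 1
df['isPeak'] = is_peak

# Alternative: a boolean mask combined with | (logical OR).
# Series.between is inclusive on both ends by default.
minutes = df['date'].dt.hour * 60 + df['date'].dt.minute
mask = minutes.between(7 * 60, 9 * 60) | minutes.between(12 * 60, 14 * 60)

print(df[df['isPeak'] == 1].index.tolist())
```

The mask version avoids integer positions entirely, so df['isPeak'] = mask.astype(int) would give the same column without calling indexer_between_time.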