YYYY MM DD HH DIR VEL
1990 1 1 1 112.0 4.1
1990 1 1 2 121.0 3.6
1990 1 1 3 27.0 3.1
1990 1 1 4 48.0 2.1
1990 1 1 5 129.0 2.6
1990 1 1 6 61.0 1.1
1990 1 1 7 78.0 3.1
1990 1 1 8 12.0 1.6
1990 1 1 9 37.0 2.6
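I want to specify intervals for wind speed (column "VEL") and wind direction (column "DIR"). Then I want to create a new dataframe that counts the frequencies within those intervals.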
dir_interval = [0,1,2,3,4,5]
vel_interval = [0,60,120,180,240,300,360]
0-1 1-2 2-3 3-4 4-5
0 - 60 2 1
60 - 120 1 1 1 1 1
120 - 180 1 1
180 - 240
240 - 300
300 - 360
Answer 0 (score: 4)
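You seem to have mixed up dir_interval and vel_interval: the edges in vel_interval (0 to 360) are direction bins, and the edges in dir_interval (0 to 5) are speed bins. That said, I think you are looking for pd.crosstab and pd.cut: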
pd.crosstab(pd.cut(df['DIR'], vel_interval),
            pd.cut(df['VEL'], dir_interval))
Output:
VEL         (1, 2]  (2, 3]  (3, 4]  (4, 5]
DIR
(0, 60]          1       2       1       0
(60, 120]        1       0       1       1
(120, 180]       0       1       1       0
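As a self-contained sketch of the crosstab/cut approach, the snippet below retypes the sample rows from the question and also restores the empty bins that crosstab leaves out. The names dir_bins, vel_bins and edge_labels are assumptions made for this illustration (they are not from the question), and the string labels are built by hand rather than taken from pandas.

```python
import pandas as pd

# Sample rows retyped from the question
df = pd.DataFrame({
    "DIR": [112.0, 121.0, 27.0, 48.0, 129.0, 61.0, 78.0, 12.0, 37.0],
    "VEL": [4.1, 3.6, 3.1, 2.1, 2.6, 1.1, 3.1, 1.6, 2.6],
})

# Hypothetical names: bin edges named after the column they apply to
dir_bins = [0, 60, 120, 180, 240, 300, 360]
vel_bins = [0, 1, 2, 3, 4, 5]


def edge_labels(edges):
    """Build labels such as '0-60' from consecutive bin edges."""
    return [f"{lo}-{hi}" for lo, hi in zip(edges[:-1], edges[1:])]


dir_labels = edge_labels(dir_bins)
vel_labels = edge_labels(vel_bins)

# pd.cut assigns each value to a right-closed bin (lo, hi]; astype(str) turns
# the categorical result into plain strings so crosstab sees ordinary labels
dir_binned = pd.cut(df["DIR"], dir_bins, labels=dir_labels).astype(str)
vel_binned = pd.cut(df["VEL"], vel_bins, labels=vel_labels).astype(str)

# crosstab only lists bins that actually occur, so reindex to add the empty ones
freq = (pd.crosstab(dir_binned, vel_binned)
          .reindex(index=dir_labels, columns=vel_labels, fill_value=0))
print(freq)
```

Passing fill_value=0 to reindex keeps the counts as integers; filling afterwards with fillna would first turn the table into floats.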
Update: np.histogram2d is also a good option, though more verbose:
hist, _, _ = np.histogram2d(x=df['DIR'], y=df['VEL'],
                            bins=(vel_interval, dir_interval))
out = pd.DataFrame(hist.astype(int),
                   index=[f'{x}-{y}' for x, y in zip(vel_interval[:-1],
                                                     vel_interval[1:])],
                   columns=[f'{x}-{y}' for x, y in zip(dir_interval[:-1],
                                                       dir_interval[1:])]
                   )
Output:
         0-1  1-2  2-3  3-4  4-5
0-60       0    1    2    1    0
60-120     0    1    0    1    1
120-180    0    0    1    1    0
180-240    0    0    0    0    0
240-300    0    0    0    0    0
300-360    0    0    0    0    0
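One caveat when choosing between the two approaches: pd.cut and the NumPy histogram functions treat values that sit exactly on a bin edge differently. The sketch below uses made-up readings (not from the question) and the one-dimensional np.histogram for brevity; np.histogram2d follows the same edge convention.

```python
import numpy as np
import pandas as pd

edges = [0, 60, 120]
# Hypothetical readings that sit exactly on bin edges
values = pd.Series([60.0, 0.0, 120.0])

# pd.cut default: right-closed bins (lo, hi], so 0.0 is outside and becomes NaN
cut_default = pd.cut(values, edges)

# include_lowest=True also accepts the lowest edge, so nothing is dropped
cut_lowest = pd.cut(values, edges, include_lowest=True)

# right=False switches to left-closed bins [lo, hi), so 120.0 is dropped instead
cut_left = pd.cut(values, edges, right=False)

# np.histogram: bins are [lo, hi), except the last one, which is closed on both
# sides; 60.0 and 120.0 therefore both land in the second bin
counts, _ = np.histogram(values, bins=edges)

print(cut_default.isna().tolist())
print(cut_lowest.isna().tolist())
print(cut_left.isna().tolist())
print(counts.tolist())
```

For the sample data in the question no value lies on an edge, which is why both approaches report the same counts there.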
Answer 1 (score: 0)
Another way:
vel_interval = [0, 60, 120, 180, 240, 300, 360]  # direction edges (names swapped in the question)
dir_interval = [0, 1, 2, 3, 4, 5]                # speed edges
# Bin direction and turn labels like "(0, 60]" into "0-60"
s = (pd.cut(df.DIR, vel_interval).astype(str)
       .str.replace(r'\(|\]', '', regex=True).str.replace(r',\s', '-', regex=True))
# Bin velocity the same way
h = (pd.cut(df.VEL, dir_interval).astype(str)
       .str.replace(r'\(|\]', '', regex=True).str.replace(r',\s', '-', regex=True))
# Row labels "0-60", "60-120", ... so that empty direction bins are kept
newindex = [f'{a}-{b}' for a, b in zip(vel_interval[:-1], vel_interval[1:])]
# Tabulate, adding the missing rows filled with 0
pd.crosstab(s, h).rename_axis(index=None, columns=None).reindex(newindex, fill_value=0)
         1-2  2-3  3-4  4-5
0-60       1    2    1    0
60-120     1    0    1    1
120-180    0    1    1    0
180-240    0    0    0    0
240-300    0    0    0    0
300-360    0    0    0    0