I have wind direction and speed data at 10-minute intervals in a Pandas DataFrame. It looks like this:
year month day hour minutes direction speed filename
0 1999.0 1 1 0 0 84.0 7.1 mlrf1c1999.txt
1 1999.0 1 1 0 10 75.0 7.5 mlrf1c1999.txt
2 1999.0 1 1 0 20 79.0 7.2 mlrf1c1999.txt
3 1999.0 1 1 0 30 77.0 7.2 mlrf1c1999.txt
4 1999.0 1 1 0 40 76.0 6.7 mlrf1c1999.txt
5 1999.0 1 1 0 50 76.0 7.5 mlrf1c1999.txt
6 1999.0 1 1 1 0 81.0 6.9 mlrf1c1999.txt
7 1999.0 1 1 1 10 75.0 7.3 mlrf1c1999.txt
8 1999.0 1 1 1 20 77.0 7.4 mlrf1c1999.txt
9 1999.0 1 1 1 30 73.0 6.9 mlrf1c1999.txt
10 1999.0 1 1 1 40 78.0 6.5 mlrf1c1999.txt
11 1999.0 1 1 1 50 75.0 7.3 mlrf1c1999.txt
...
1147812 1997.0 12 31 21 0 261.0 6.0 mlrf1c1997.txt
1147813 1997.0 12 31 21 10 260.0 5.9 mlrf1c1997.txt
1147814 1997.0 12 31 21 20 262.0 5.5 mlrf1c1997.txt
1147815 1997.0 12 31 21 30 279.0 6.5 mlrf1c1997.txt
1147816 1997.0 12 31 21 40 283.0 7.3 mlrf1c1997.txt
1147817 1997.0 12 31 21 50 282.0 7.2 mlrf1c1997.txt
1147818 1997.0 12 31 22 0 277.0 6.9 mlrf1c1997.txt
1147819 1997.0 12 31 22 10 283.0 7.6 mlrf1c1997.txt
1147820 1997.0 12 31 22 20 283.0 7.2 mlrf1c1997.txt
1147821 1997.0 12 31 22 30 290.0 7.5 mlrf1c1997.txt
1147822 1997.0 12 31 22 40 289.0 7.2 mlrf1c1997.txt
1147823 1997.0 12 31 22 50 292.0 7.6 mlrf1c1997.txt
1147824 1997.0 12 31 23 0 296.0 7.7 mlrf1c1997.txt
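I am trying to summarize the data with a pivot table so that I can get the mean direction and speed for each hour. I need to apply SciPy's circmean function to the direction data, which requires specifying the high and low parameters for the data set. When I try either of the following, I get TypeError: 'numpy.float64' object is not callable.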
df.pivot_table(values = ['direction'], index = ['day', 'hour'], aggfunc = circmean(df.direction, high=df.direction.max(), low=df.direction.min()))
df.pivot_table(values = ['direction'], index = ['day', 'hour'], aggfunc = circmean(df.direction, high=360, low=0))
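As I understand it, circmean needs the high and low arguments to produce accurate output. When I use np.mean to average the wind speed readings, I have no trouble: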
df.pivot_table(values = ['speed'], index = ['day', 'hour'], aggfunc = np.mean)
Which yields:
speed
day hour
1 0 6.085055
1 6.144919
2 6.253006
3 6.315291
4 6.305656
5 6.241176
6 6.205701
I can also apply the circmean function without arguments, like this:
df.pivot_table(values = ['direction'], index = ['day', 'hour'], aggfunc = circmean)
When I do this, I get results I cannot interpret (they are not on a 0 to 360 degree scale):
direction
day hour
1 0 2.992024
1 3.414254
2 1.620715
3 0.463309
4 6.206874
5 1.451950
6 4.319550
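Is there a way to pass a function together with its arguments to the aggfunc parameter of pivot_table? If not, does anyone have a suggestion for how I can get the circular means I need from the DataFrame?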
Answer 0 (score: 0)
Here is some code that reproduces your problem:
import io
import pandas as pd
from scipy.stats import circmean
doc = """ year month day hour minutes direction speed filename
0 1999.0 1 1 0 0 84.0 7.1 mlrf1c1999.txt
1 1999.0 1 1 0 10 75.0 7.5 mlrf1c1999.txt
2 1999.0 1 1 0 20 79.0 7.2 mlrf1c1999.txt
3 1999.0 1 1 0 30 77.0 7.2 mlrf1c1999.txt
4 1999.0 1 1 0 40 76.0 6.7 mlrf1c1999.txt
5 1999.0 1 1 0 50 76.0 7.5 mlrf1c1999.txt
6 1999.0 1 1 1 0 81.0 6.9 mlrf1c1999.txt
7 1999.0 1 1 1 10 75.0 7.3 mlrf1c1999.txt
8 1999.0 1 1 1 20 77.0 7.4 mlrf1c1999.txt
9 1999.0 1 1 1 30 73.0 6.9 mlrf1c1999.txt
10 1999.0 1 1 1 40 78.0 6.5 mlrf1c1999.txt
11 1999.0 1 1 1 50 75.0 7.3 mlrf1c1999.txt
1147812 1997.0 12 31 21 0 261.0 6.0 mlrf1c1997.txt
1147813 1997.0 12 31 21 10 260.0 5.9 mlrf1c1997.txt
1147814 1997.0 12 31 21 20 262.0 5.5 mlrf1c1997.txt
1147815 1997.0 12 31 21 30 279.0 6.5 mlrf1c1997.txt
1147816 1997.0 12 31 21 40 283.0 7.3 mlrf1c1997.txt
1147817 1997.0 12 31 21 50 282.0 7.2 mlrf1c1997.txt
1147818 1997.0 12 31 22 0 277.0 6.9 mlrf1c1997.txt
1147819 1997.0 12 31 22 10 283.0 7.6 mlrf1c1997.txt
1147820 1997.0 12 31 22 20 283.0 7.2 mlrf1c1997.txt
1147821 1997.0 12 31 22 30 290.0 7.5 mlrf1c1997.txt
1147822 1997.0 12 31 22 40 289.0 7.2 mlrf1c1997.txt
1147823 1997.0 12 31 22 50 292.0 7.6 mlrf1c1997.txt
1147824 1997.0 12 31 23 0 296.0 7.7 mlrf1c1997.txt"""
# raw string so that \s is not treated as a string escape sequence
df = pd.read_csv(io.StringIO(doc), sep=r'\s+')
Grumpy note: a better question would have included code like the above. Reproducing the problem took some unnecessary work and time. See https://stackoverflow.com/help/mcve for details.
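As for the TypeError in the question: an expression such as circmean(df.direction, high=360, low=0) is evaluated before pivot_table ever sees it, so aggfunc receives a number rather than a function. A minimal sketch of the difference, using a made-up three-value array rather than your data:

```python
import numpy as np
from scipy.stats import circmean

# Illustrative values only, not taken from the real data set.
directions = np.array([84.0, 75.0, 79.0])

# Calling circmean here runs it immediately and yields a single number.
value = circmean(directions, high=360, low=0)
print(type(value), callable(value))

# The function object itself is callable; this is what aggfunc expects.
print(callable(circmean))
```

pandas later tries to call whatever was passed as aggfunc on each group, which is where the "object is not callable" message comes from.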
# Now you need a function that accepts one argument to pass as `aggfunc`
def avg(x):
    # x is a pd.Series holding the direction values of one (day, hour) group
    return circmean(x, high=x.max(), low=x.min())
# just to learn how it works with 'mean'
df2 = df.pivot_table(values='direction', index=['day', 'hour'], aggfunc = 'mean')
# now putting the desired function
df3 = df.pivot_table(values='direction', index=['day', 'hour'], aggfunc = avg)
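If the directions are compass bearings, it may make more sense to fix the range at 0 to 360 instead of taking it from each group's min and max. Here is a hedged sketch of that variant using functools.partial. The name circmean_deg and the small sample frame are my own illustration, not part of the question's data. Called with no arguments, circmean assumes a range of 0 to 2π, which is presumably why the unparameterized output in the question stays below about 6.28.

```python
from functools import partial

import pandas as pd
from scipy.stats import circmean

# Small illustrative frame (made-up values, not from the real data set).
sample = pd.DataFrame({
    "day": [1, 1, 1, 1],
    "hour": [0, 0, 1, 1],
    "direction": [350.0, 10.0, 80.0, 100.0],
})

# Bind the compass range once; the result is still a callable taking one argument.
circmean_deg = partial(circmean, high=360, low=0)

hourly = sample.pivot_table(values="direction", index=["day", "hour"],
                            aggfunc=circmean_deg)
print(hourly)
```

For hour 0 the two readings straddle north, so the circular mean lands at north (0, or equivalently 360, up to rounding), whereas a plain arithmetic mean of 350 and 10 would give 180. A lambda or a small def wrapper would work the same way as partial here.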
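This emits a warning, which I hope you know how to deal with (you may also want to convert between radians and degrees inside avg):
RuntimeWarning: invalid value encountered in true_divide
  ang = (samples - low)*2*pi / (high - low)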
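My reading of the warning is that it comes from groups where high equals low. In the sample above, the last hour (day 31, hour 23) has a single row, so x.max() and x.min() are the same value and the scaling step divides zero by zero. A small sketch that repeats just that arithmetic with NumPy, not SciPy's actual internals:

```python
import numpy as np

# With a single sample (or identical samples), max equals min.
x = np.array([296.0])
low, high = x.min(), x.max()

# Same scaling step as the line quoted in the warning: 0.0 / 0.0 gives NaN.
# errstate silences the RuntimeWarning for this demonstration only.
with np.errstate(invalid="ignore"):
    ang = (x - low) * 2 * np.pi / (high - low)
print(ang)
```

Using a fixed range such as low=0, high=360 avoids this case, because the denominator no longer depends on the data in the group.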
Hope that helps.