How to fix the pandas groupby error: No numeric types to aggregate

Time: 2013-11-09 08:59:34

Tags: python pandas

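I have a problem with my code, which computes statistics for many variables. The odd thing is that it runs on some data files but not on others. The files all have the same structure; only the month changes from one file to the next.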

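import numpy as np
from numpy import sin, cos, pi
from pandas import read_csv

# Read one month of station data, replacing the file's header with explicit column names.
data = read_csv('/home/mcidas/Escritorio/estadisticas-cea/linux/2011/datos/emas/amealco/enero.csv',
                skiprows=1,
                names=['Fecha','Hora','C','D','E','Temperatura','TempRocio','DirViento','I','MagViento',
                       'K','Humedad','Presion','N','PreciAcu','P','Q','R','S'],
                header=0)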


direccion=[]
for i in data['DirViento']:
 if i=='SSW':
     dir=202.5
 elif i=='S':
     dir=180.0
 elif i=='N':
     dir=360.0
 elif i=='NNE':
     dir=22.5
 elif i=='NE':
     dir=45.0
 elif i=='ENE':
     dir=67.5
 elif i=='E':
     dir=90.0
 elif i=='ESE':
     dir=112.5
 elif i=='SE':
     dir=135.0
 elif i=='SSE':
     dir=157.5
 elif i=='SW':
     dir=225.0
 elif i=='WSW':
     dir=247.5
 elif i=='W':
     dir=270.0
 elif i=='WNW':
     dir=292.5
 elif i=='NW':
     dir=315.0
 elif i=='NNW':
     dir=337.5
 else:
     dir=np.nan
 direccion.append(dir)
data['DirViento']=direccion

Uviento=[]
Vviento=[]

for i in range(0,len(data['MagViento'])):
   Uviento.append((data['MagViento'][i]*sin((data['DirViento'][i]+180)*(pi/180.0))))
   Vviento.append((data['MagViento'][i]*cos((data['DirViento'][i]+180)*(pi/180.0))))

data['PromeU']=Uviento
data['PromeV']=Vviento

index=data.set_index(['Fecha','Hora'])
g = index.groupby(level=0)
stat_cea_mean = g.agg({'PromeU':np.mean,'PromeV':np.mean,'Temperatura':np.mean,'TempRocio':np.mean,'Humedad':np.mean,'Presion':np.mean,'PreciAcu':np.sum}) 
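
As an aside, the two loops above can also be written with a lookup table and vectorized NumPy operations. The following is only a sketch under assumptions: the sample rows are hypothetical, and `COMPASS_DEGREES` and `add_wind_components` are names introduced here for illustration, not part of the original code. The degree values are copied from the if/elif chain.

```python
import numpy as np
import pandas as pd

# Compass point -> degrees, same values as the if/elif chain in the question.
COMPASS_DEGREES = {
    'N': 360.0, 'NNE': 22.5, 'NE': 45.0, 'ENE': 67.5,
    'E': 90.0, 'ESE': 112.5, 'SE': 135.0, 'SSE': 157.5,
    'S': 180.0, 'SSW': 202.5, 'SW': 225.0, 'WSW': 247.5,
    'W': 270.0, 'WNW': 292.5, 'NW': 315.0, 'NNW': 337.5,
}

def add_wind_components(df):
    """Return a copy of df with DirViento in degrees plus PromeU and PromeV."""
    out = df.copy()
    # Labels missing from the table (for example '---') become NaN,
    # which matches the else branch of the original loop.
    out['DirViento'] = out['DirViento'].map(COMPASS_DEGREES)
    theta = np.deg2rad(out['DirViento'] + 180.0)
    out['PromeU'] = out['MagViento'] * np.sin(theta)
    out['PromeV'] = out['MagViento'] * np.cos(theta)
    return out

# Hypothetical sample rows, not taken from the real files.
sample = pd.DataFrame({'DirViento': ['S', 'W', '---'],
                       'MagViento': [2.0, 3.0, 1.0]})
result = add_wind_components(sample)
print(result)
```

This assumes `MagViento` is already numeric; if it was read as text, it would need converting first.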

The code above runs successfully, but when I change the month (to August) and use the same code, I get the following errors:

First, when I run the same code, it fails here:

index=data.set_index(['Fecha','Hora'])
g = index.groupby(level=0)

IndexError: index out of range for array

Then, when I try changing the index:

index=data.set_index(['Fecha','Hora'])
g = data.groupby(level=0)
stat_cea_mean = g.agg({'PromeU':np.mean,'PromeV':np.mean,'Temperatura':np.mean,'TempRocio':np.mean,'Humedad':np.mean,'Presion':np.mean,'PreciAcu':np.sum})

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python2.6/site-packages/pandas-0.9.1-py2.6-linux-x86_64.egg/pandas/core/groupby.py", line 304, in agg
    return self.aggregate(func, *args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/pandas-0.9.1-py2.6-linux-x86_64.egg/pandas/core/groupby.py", line 1573, in aggregate
    result[col] = colg.aggregate(agg_how)
  File "/usr/lib64/python2.6/site-packages/pandas-0.9.1-py2.6-linux-x86_64.egg/pandas/core/groupby.py", line 1301, in aggregate
    return getattr(self, cyfunc)()
  File "/usr/lib64/python2.6/site-packages/pandas-0.9.1-py2.6-linux-x86_64.egg/pandas/core/groupby.py", line 319, in mean
    return self._cython_agg_general('mean')
  File "/usr/lib64/python2.6/site-packages/pandas-0.9.1-py2.6-linux-x86_64.egg/pandas/core/groupby.py", line 408, in _cython_agg_general
    raise DataError('No numeric types to aggregate')
pandas.core.groupby.DataError: No numeric types to aggregate
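
One detail in the second attempt may be worth checking: `set_index` returns a new frame, so calling `groupby(level=0)` on `data` rather than on `index` groups by the default integer row index, not by `Fecha`. A minimal sketch of the difference, using hypothetical rows and a recent pandas release (not the 0.9.1 shown in the traceback):

```python
import pandas as pd

# Hypothetical rows standing in for the real file.
data = pd.DataFrame({
    'Fecha': ['01/08/2011', '01/08/2011', '02/08/2011'],
    'Hora': ['00:00', '00:10', '00:00'],
    'Temperatura': [15.0, 17.0, 14.0],
})

# set_index returns a new frame; `data` keeps its default integer index.
indexed = data.set_index(['Fecha', 'Hora'])

# Level 0 of `data` is the row number, so every row is its own group.
by_row = data.groupby(level=0)['Temperatura'].mean()

# Level 0 of `indexed` is Fecha, so rows are grouped per date.
by_date = indexed.groupby(level=0)['Temperatura'].mean()

print(len(by_row), len(by_date))
```

This does not by itself explain the `DataError`; it only shows that the two `groupby` calls do not group on the same thing.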

DataFrame for enero (January):

Data columns:
Fecha          4464  non-null values
Hora           4464  non-null values
C              4464  non-null values
D              4464  non-null values
E              4464  non-null values
Temperatura    4464  non-null values
TempRocio      4464  non-null values
DirViento      4464  non-null values
I              4464  non-null values
MagViento      4464  non-null values
K              4464  non-null values
Humedad        4464  non-null values
Presion        4464  non-null values
N              4464  non-null values
PreciAcu       4464  non-null values
P              4464  non-null values
Q              4464  non-null values
R              4464  non-null values
S              4464  non-null values
dtypes: float64(8), int64(4), object(7)

 data['DirViento']
 0     SSW
 1     SSW
 2     SSW
 3     SSW
 4     SSW
 5      SW
 6     SSW
 7      SW
 8      SW
 9      SW
10    SSW
11    SSW
12    SSW
13    SSW
14    SSW 
...
4449     SW
4450    SSW
4451    SSW
4452     SW
4453     SW
4454     SW
4455     SW  
4456     SW
4457     SW
4458     SW
4459     SW
4460    SSW
4461     SW
4462     SW
4463     SW
Name: DirViento, Length: 4464

DataFrame for agosto (August):

Data columns:

Fecha          3703  non-null values
Hora           3703  non-null values
C              3703  non-null values
D              3703  non-null values
E              3703  non-null values
Temperatura    3703  non-null values
TempRocio      3703  non-null values
DirViento      3703  non-null values
I              3703  non-null values
MagViento      3703  non-null values
K              3703  non-null values
Humedad        3703  non-null values
Presion        3703  non-null values
N              3703  non-null values
PreciAcu       3703  non-null values
P              3703  non-null values
Q              3703  non-null values
R              3703  non-null values
S              3703  non-null values
dtypes: float64(7), object(12)

data['DirViento']
0     ENE
1       E
2     ENE
3     ENE
4       E
5       E
6       E
7       E
8       E
9       E
10      E
11      E
12      E
13      E
14    ESE
...
3689    ---
3690    ---
3691    ---
3692    ---
3693    ---
3694    ---
3695    ---
3696    ---
3697    ---
3698    ---
3699    ---
3700    ---
3701    ---
3702    ---
3703    NaN
Name: DirViento, Length: 3704
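
A possible reading of the two summaries: January has 7 object columns while August has 12, and the August `DirViento` output contains `---` placeholders. If the measurement columns in the August file also contain `---`, pandas would read them as text, which would be consistent with "No numeric types to aggregate". The sketch below illustrates that idea on a hypothetical miniature file; the rows are invented, and `pd.to_numeric` comes from pandas releases newer than the 0.9.1 in the traceback.

```python
import io
import pandas as pd

# Hypothetical miniature of the August file: '---' marks missing readings.
raw = io.StringIO(
    "Fecha,Hora,Temperatura,PreciAcu\n"
    "01/08/2011,00:00,15.2,0.0\n"
    "01/08/2011,00:10,---,0.2\n"
    "02/08/2011,00:00,14.0,---\n"
)

as_read = pd.read_csv(raw)
# A single '---' is enough to make the whole column non-numeric (text).
print(as_read.dtypes)

# Option 1: coerce after reading; strings that cannot be parsed become NaN.
cleaned = as_read.copy()
for col in ['Temperatura', 'PreciAcu']:
    cleaned[col] = pd.to_numeric(cleaned[col], errors='coerce')

# Option 2: declare the placeholder while reading the file.
raw.seek(0)
declared = pd.read_csv(raw, na_values=['---'])

# With numeric columns, the per-date aggregation works; NaN values are skipped.
daily = cleaned.groupby('Fecha').agg({'Temperatura': 'mean', 'PreciAcu': 'sum'})
print(daily)
```

Whether `---` really appears in the numeric columns of the August file can only be confirmed by inspecting the file itself, for example by printing the dtypes of the frame right after reading it.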

Sorry for asking so much, but I really want to learn.

0 answers:

No answers