我的数据集是维度的数据框架(840,84)。当我写代码时:
ds[ds.columns[1]].value_counts()
我得到了正确的输出:
Out[82]:
0 847
1 5
Name: o_East, dtype: int64
但是当我写一个循环来存储值时,我得到'DataFrame'对象没有属性'value_counts'。我无法解释为什么......
wind_vec = []
wind_vec = [(ds[x].value_counts()) for x in ds.columns]
更新代码
import pandas as pd
import numpy as np
import numpy.ma as ma
import statsmodels.api as sm
import matplotlib
import matplotlib.pyplot as plt
from sklearn.preprocessing import OneHotEncoder
dataset = pd.read_csv('data/dataset.csv')
ds = dataset
o_wdire = pd.get_dummies(ds['o_wdire'])
s_wdire = pd.get_dummies(ds['s_wdire'])
t_wdire = pd.get_dummies(ds['t_wdire'])
k_wdire = pd.get_dummies(ds['k_wdire'])
b_wdire = pd.get_dummies(ds['b_wdire'])
o_wdire.rename(columns={'ENE': 'o_ENE','ESE': 'o_ESE', 'East': 'o_East', 'NE': 'o_NE', 'NNE': 'o_NNE', 'NNW': 'o_NNW', \
'NW': 'o_NW', 'North': 'o_North', 'SE': 'o_SE', 'SSE': 'o_SSE', 'SSW': 'o_SSW', 'SW': 'o_SW', \
'South': 'o_South', 'Variable': 'o_Variable', 'WSW': 'o_WSW','West':'o_West'}, inplace=True)
s_wdire.rename(columns={'ENE': 's_ENE','ESE': 's_ESE', 'East': 's_East', 'NE': 's_NE', 'NNE': 's_NNE', 'NNW': 's_NNW', \
'NW': 's_NW', 'North': 's_North', 'SE': 's_SE', 'SSE': 's_SSE', 'SSW': 's_SSW', 'SW': 's_SW', \
'South': 's_South', 'Variable': 's_Variable', 'West': 's_West','WSW': 's_WSW'}, inplace=True)
k_wdire.rename(columns={'ENE': 'k_ENE','ESE': 'k_ESE', 'East': 'k_East', 'NE': 'k_NE', 'NNE': 'k_NNE', 'NNW': 'k_NNW', \
'NW': 'k_NW', 'North': 'k_North', 'SE': 'k_SE', 'SSE': 'k_SSE', 'SSW': 'k_SSW', 'SW': 'k_SW', \
'South': 'k_South', 'Variable': 'k_Variable', 'WNW': 'k_WNW', 'West': 'k_West','WSW': 'k_WSW'}, inplace=True)
b_wdire.rename(columns={'ENE': 'b_ENE','ESE': 'b_ESE', 'East': 'b_East', 'NE': 'b_NE', 'NNE': 'b_NNE', 'NNW': 'b_NNW', \
'NW': 'b_NW', 'North': 'b_North', 'SE': 'b_SE', 'SSE': 'b_SSE', 'SSW': 'b_SSW', 'SW': 'b_SW', \
'South': 'b_South', 'Variable': 'b_Variable', 'WSW': 'b_WSW', 'WNW': 'b_WNW', 'West': 'b_West'}, inplace=True)
t_wdire.rename(columns={'ENE': 't_ENE','ESE': 't_ESE', 'East': 't_East', 'NE': 't_NE', 'NNE': 't_NNE', 'NNW': 't_NNW', \
'NW': 't_NW', 'North': 't_North', 'SE': 't_SE', 'SSE': 't_SSE', 'SSW': 't_SSW', 'SW': 't_SW', \
'South': 't_South', 'Variable': 't_Variable', 'WSW': 't_WSW', 'WNW': 't_WNW', 'West':'t_West'}, inplace=True)
#WIND
ds_wdire = pd.DataFrame(pd.concat([o_wdire,s_wdire,t_wdire,k_wdire,b_wdire],axis=1))
ds_wdire = ds_wdire.astype('float64')
In [93]: ds_wdire.shape
Out[93]: (852, 84)
In[101]: ds_wdire[ds_wdire.columns[0]].head()
Out[101]:
0 0
1 0
2 0
3 0
4 0
Name: o_ENE, dtype: float64
In[103]: ds_wdire[ds_wdire.columns[0]].value_counts()
Out[103]:
0 838
1 14
Name: o_ENE, dtype: int64
In[104]: [ds_wdire[x].value_counts() for x in ds_wdire.columns]
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-104-d9756c468818> in <module>()
1 #Filtering for the wind direction based on the most frequent ones.
----> 2 [ds_wdire[x].value_counts() for x in ds_wdire.columns]
<ipython-input-104-d9756c468818> in <listcomp>(.0)
1 #Filtering for the wind direction based on the most frequent ones.
----> 2 [ds_wdire[x].value_counts() for x in ds_wdire.columns]
/home/florian/anaconda3/lib/python3.5/site-packages/pandas/core/generic.py in __getattr__(self, name)
2358 return self[name]
2359 raise AttributeError("'%s' object has no attribute '%s'" %
-> 2360 (type(self).__name__, name))
2361
2362 def __setattr__(self, name, value):
AttributeError: 'DataFrame' object has no attribute 'value_counts'
答案 0 :(得分:1)
感谢@EdChum的建议,我查了一下:
len(ds_wdire.columns),len(ds_wdire.columns.unique())
Out[100]: (83,84)
实际上,dict中缺少一个应该从'WNW'修改为'o_WNW'的名称值。:
o_wdire.rename(columns={'ENE': 'o_ENE','ESE': 'o_ESE', 'East': 'o_East', 'NE': 'o_NE', 'NNE': 'o_NNE', 'NNW': 'o_NNW', \
'NW': 'o_NW', 'North': 'o_North', 'SE': 'o_SE', 'SSE': 'o_SSE', 'SSW': 'o_SSW', 'SW': 'o_SW', \
'South': 'o_South', 'Variable': 'o_Variable', 'WSW': 'o_WSW','West':'o_West', **[MISSING VALUE WNW]**}, inplace=True)
也许最好编写一个在风向变量上插入前缀的循环,这样一来,我会避免这种问题。