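value_counts() AttributeError: 'str' object has no attribute 'value_counts'

Asked: 2019-11-14 02:23:25

Tags: python python-3.x pandas dataframe

I am running the code below to achieve the objective outlined further down, but I hit an error that I am not sure how to fix.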

class variableTreatment():
         def drop_zero_car_col(self, df):
            numerical = list(df._get_numeric_data().columns)
            categorical = list(set(df.columns).difference(set(numerical)))
            ls = []
            for i in categorical:
                d = dict(i.value_counts())
                if len(d)==1:
                    ls.append(i)
            df.drop(ls,axis=1,inplace=True)
            return(df)

import pandas as pd
import numpy as np
from sklearn.preprocessing import MinMaxScaler,StandardScaler,LabelEncoder
df = pd.read_excel('CKD.xlsx')

VT = variableTreatment()

VT

VT.drop_zero_car_col(df).head()

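What it should do: drop categorical columns that have only a single level, for example a column whose values are all "Yes".

Input: 1. df: a pandas DataFrame

Output: 1. The DataFrame df with those columns removed (if no column is removed, the same DataFrame is returned)

But I get this error: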

AttributeError                            Traceback (most recent call last)
<ipython-input-10-e04a1da339fd> in <module>
----> 1 VT.drop_zero_car_col(df).head()

<ipython-input-3-64cf5361fc06> in drop_zero_car_col(self, df)
     54         ls = []
     55         for i in categorical:
---> 56             d = dict(i.value_counts())
     57             if len(d)==1:
     58                 ls.append(i)

AttributeError: 'str' object has no attribute 'value_counts'

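1 Answer:

Answer 0: (score: 1)

The loop iterates over column names, which are plain strings, so `i.value_counts()` fails; the call has to be made on the column itself, `df[i]`. To drop non-numeric columns that hold the same value in every row, you can change the function as follows: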

import numpy as np

class variableTreatment():
    def drop_zero_car_col(self, df):
        # select numeric columns through the public API instead of a private method
        numerical = list(df.select_dtypes([np.number]).columns)
        # everything that is not numeric is treated as categorical
        categorical = list(set(df.columns).difference(set(numerical)))
        ls = []
        for i in categorical:
            # i is a column name (str); df[i] is the Series that has value_counts()
            # value_counts() returns one entry per distinct non-null value
            d = df[i].value_counts()
            # a single entry means the column holds one value repeated in
            # every row, so the column should be dropped
            if len(d) == 1:
                ls.append(i)
        # note: inplace=True also modifies the DataFrame passed in by the caller
        df.drop(ls, axis=1, inplace=True)
        return df
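
To see why the original loop fails, and as a possible shorter variant, here is a minimal sketch. The toy DataFrame and its column names (`age`, `htn`, `dm`) are made up for illustration and merely stand in for `CKD.xlsx`; using `nunique()` instead of `value_counts()` is my substitution, not part of the code above. Both ignore missing values by default, so the check should behave the same way.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for CKD.xlsx
df = pd.DataFrame({
    "age": [48, 53, 61],
    "htn": ["yes", "yes", "yes"],  # single level, should be dropped
    "dm": ["yes", "no", "yes"],    # two levels, should be kept
})

# Labels of the non-numeric columns
categorical = df.select_dtypes(exclude=[np.number]).columns

# Iterating over column labels yields strings; indexing df gives the Series
for name in categorical:
    print(type(name).__name__, type(df[name]).__name__)

# Collect columns with exactly one distinct non-null value
constant_cols = [name for name in categorical if df[name].nunique() == 1]

# Return a new DataFrame instead of modifying df in place
result = df.drop(columns=constant_cols)
print(list(result.columns))
```

Unlike the `inplace=True` version, this leaves the original `df` untouched, which may or may not be what you want.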