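I am running the code below, which is meant to accomplish the objective outlined further down, but I get an error that I am not sure how to resolve.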
class variableTreatment():
    def drop_zero_car_col(self, df):
        numerical = list(df._get_numeric_data().columns)
        categorical = list(set(df.columns).difference(set(numerical)))
        ls = []
        for i in categorical:
            d = dict(i.value_counts())
            if len(d)==1:
                ls.append(i)
        df.drop(ls,axis=1,inplace=True)
        return(df)
import pandas as pd
import numpy as np
from sklearn.preprocessing import MinMaxScaler,StandardScaler,LabelEncoder
df = pd.read_excel('CKD.xlsx')
VT = variableTreatment()
VT
VT.drop_zero_car_col(df).head()
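Objective: drop categorical columns that have only a single level, for example a column whose values are all "yes".
Input: 1. df: a pandas DataFrame
Output: 1. df with the columns dropped (if no column is dropped, the same DataFrame is returned)
But I get this error: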
AttributeError Traceback (most recent call last)
<ipython-input-10-e04a1da339fd> in <module>
----> 1 VT.drop_zero_car_col(df).head()
<ipython-input-3-64cf5361fc06> in drop_zero_car_col(self, df)
     54         ls = []
     55         for i in categorical:
---> 56             d = dict(i.value_counts())
     57             if len(d)==1:
     58                 ls.append(i)

AttributeError: 'str' object has no attribute 'value_counts'
Answer 0 (score: 1)
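The error occurs because `categorical` is a list of column names, so `i` is a string and has no `value_counts` method; the column itself has to be looked up with `df[i]`. To drop the non-numeric columns that contain a single repeated value, you can change the function as follows: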
import numpy as np

class variableTreatment():
    def drop_zero_car_col(self, df):
        # select the numeric columns through the public API instead of a private method
        numerical = list(df.select_dtypes([np.number]).columns)
        # everything that is not numeric is treated as categorical
        categorical = list(set(df.columns).difference(set(numerical)))
        ls = []
        for i in categorical:
            # value_counts returns a Series with the count of each distinct value (NaN is ignored by default)
            d = df[i].value_counts()
            # a length of 1 means a single value fills every non-missing row, so the column should be dropped
            if len(d)==1:
                ls.append(i)
        # note: inplace=True also modifies the DataFrame that was passed in
        df.drop(ls,axis=1,inplace=True)
        return(df)
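
As a possible alternative, the same check can be written without the explicit loop by using `nunique`. This is only a minimal sketch: the function name `drop_constant_categoricals` and the small `sample` frame are hypothetical stand-ins (the contents of `CKD.xlsx` are not shown in the question), and unlike the function above it returns a new DataFrame instead of modifying the input in place.

```python
import pandas as pd


def drop_constant_categoricals(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df without non-numeric columns that hold one distinct value."""
    # Non-numeric columns only; numeric columns are never candidates for removal.
    categorical = df.select_dtypes(exclude="number").columns
    # nunique() ignores NaN by default, which matches value_counts() in the loop version.
    n_levels = df[categorical].nunique()
    constant = n_levels[n_levels == 1].index
    # drop() without inplace returns a new DataFrame and leaves the input untouched.
    return df.drop(columns=constant)


# Hypothetical sample data standing in for CKD.xlsx.
sample = pd.DataFrame(
    {
        "age": [48, 53, 61],
        "htn": ["yes", "yes", "yes"],  # single level, should be dropped
        "dm": ["yes", "no", "yes"],    # two levels, should be kept
    }
)
result = drop_constant_categoricals(sample)
print(list(result.columns))  # ['age', 'dm']
```

Whether missing values should count as a level of their own depends on the data; passing `dropna=False` to `nunique` would treat NaN as a separate level and keep a column that mixes "yes" with missing entries.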