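I have seen similar questions with the same error message. They mention the need for "properties" and "setters", but I don't think that applies to, or is the simplest way to solve, my particular problem.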
I was hoping that `.merge`, `.transform`, a `lambda`, or even a plain `=` assignment would keep the solution simple, but I get "AttributeError: can't set attribute".
DataFrame:
import numpy as np
import pandas as pd
sample_data = [['USA', 'gdp', 2001, 10],['USA', 'avgIQ', 2001, 100],['USA', 'people', 2001, 1000],['USA', 'dragons', 2001, 3],['CHN', 'gdp', 2001, 12], ['CHN', 'avgIQ', 2001, 120],['CHN', 'people', 2001, 2000],['CHN', 'dragons', 2001, 1],['RUS', 'gdp', 2001, 11],['RUS', 'avgIQ', 2001, 105], ['RUS', 'people', 2001, 1500],['RUS', 'dragons', 2001, np.nan],['USA', 'gdp', 2002, 12],['USA', 'avgIQ', 2002, 105],['USA', 'people', 2002, 1200], ['USA', 'dragons', 2002, np.nan],['CHN', 'gdp', 2002, 14],['CHN', 'avgIQ', 2002, 127],['CHN', 'people', 2002, 3100],['CHN', 'dragons', 2002, 4], ['RUS', 'gdp', 2002, 11],['RUS', 'avgIQ', 2002, 99],['RUS', 'people', 2002, 1600],['RUS', 'dragons', 2002, np.nan],['USA', 'gdp', 2003, 15], ['USA', 'avgIQ', 2003, 115],['USA', 'people', 2003, 2000],['USA', 'dragons', 2003, np.nan],['CHN', 'gdp', 2003, 16],['CHN', 'avgIQ', 2003, 132], ['CHN', 'people', 2003, 4000],['CHN', 'dragons', 2003, 6],['RUS', 'gdp', 2003, 11],['RUS', 'avgIQ', 2003, 108],['RUS', 'people', 2003, 2000], ['RUS', 'dragons', 2003, np.nan],['USA', 'gdp', 2004, 18],['USA', 'avgIQ', 2004, 111],['USA', 'people', 2004, 2500],['USA', 'dragons', 2004, np.nan], ['CHN', 'gdp', 2004, 18],['CHN', 'avgIQ', 2004, 140],['CHN', 'people', 2004, np.nan],['CHN', 'dragons', 2004, np.nan], ['RUS', 'gdp', 2004, 15],['RUS', 'avgIQ', 2004, 103],['RUS', 'people', 2004, 2800],['RUS', 'dragons', 2004, np.nan], ['USA', 'gdp', 2005, 23],['USA', 'avgIQ', 2005, 111],['USA', 'people', 2005, 3700],['USA', 'dragons', 2005, 8],['CHN', 'gdp', 2005, 22], ['CHN', 'avgIQ', 2005, 143],['CHN', 'people', 2005, 6000],['CHN', 'dragons', 2005, 15],['RUS', 'gdp', 2005, 17],['RUS', 'avgIQ', 2005, np.nan], ['RUS', 'people', 2005, 3000],['RUS', 'dragons', 2005, 3]]
sample_df = pd.DataFrame(sample_data, columns = ['A','B','C','D'])
sample_df['C'] = sample_df['C'].astype(float)
sample_df.head()
Data columns (total 4 columns):
A 60 non-null object
B 60 non-null object
C 60 non-null float64
D 50 non-null float64
dtypes: float64(2), object(2)
from impyute.imputation.cs import mice
The following line in the loop is the problem:
group['D'].values = ((mice(group.apply({'C': lambda x: x.values, 'D': lambda y: y.values})))[1]).values
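As far as I can tell, the error does not depend on `mice` at all: `Series.values` is exposed as a read-only property, so assigning to it fails whatever the right-hand side is. A minimal sketch of that, where the small frame `df` is made up for illustration and is not part of the data above:

```python
import numpy as np
import pandas as pd

# Hypothetical three-row frame standing in for a single group
df = pd.DataFrame({'C': [2001.0, 2002.0, 2003.0], 'D': [1.0, np.nan, 6.0]})

try:
    # .values is a property without a setter, so this assignment fails
    df['D'].values = np.array([1.0, 4.0, 6.0])
    error = None
except AttributeError as exc:
    error = exc

print(type(error).__name__)
```

The exact wording of the message may differ between Python versions, but the exception type should be the same.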
Note the # comments I added to the output.
# Grouping inferred from the ('CHN', 'avgIQ') style keys shown in the output
sample_group = sample_df.groupby(['A', 'B'])

for group_index, group in sample_group:
    if group.isnull().values.any():
        print(group)
        print(group['D'].values)
        print(mice(group.apply({'C': lambda x: x.values, 'D': lambda y: y.values})))
        print((mice(group.apply({'C': lambda x: x.values, 'D': lambda y: y.values})))[1])
        print(((mice(group.apply({'C': lambda x: x.values, 'D': lambda y: y.values})))[1]).values)
        # This is the line that raises AttributeError: can't set attribute
        group['D'].values = ((mice(group.apply({'C': lambda x: x.values, 'D': lambda y: y.values})))[1]).values
        print(group)
    else:
        print('Checked group but could not satisfy condition', group_index)
Checked group but could not satisfy condition ('CHN', 'avgIQ') #Does not have any nan values
A B C D
7 CHN dragons 2,001.00 1.00
19 CHN dragons 2,002.00 4.00
31 CHN dragons 2,003.00 6.00
43 CHN dragons 2,004.00 nan #Prints group because it has nan
55 CHN dragons 2,005.00 15.00
[ 1. 4. 6. nan 15.] #Prints values of 'D'
0 1
0 2,001.00 1.00
1 2,002.00 4.00
2 2,003.00 6.00
3 2,004.00 10.86 #Imputes the nan value and prints
4 2,005.00 15.00
0 1.00
1 4.00
2 6.00
3 10.86 #Prints only the column with the new imputed value
4 15.00
Name: 1, dtype: float64
[ 1. 4. 6. 10.85714286 15. ] #Prints the new values for the column
**AttributeError: can't set attribute**
A B C D
7 CHN dragons 2,001.00 1.00
19 CHN dragons 2,002.00 4.00
31 CHN dragons 2,003.00 6.00
43 CHN dragons 2,004.00 10.86 #Replace the original 'D' column for that group, with the new value(s)
55 CHN dragons 2,005.00 15.00
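Ultimately, I want to create a new df that contains all the original groups that have no nan values, plus the updated groups whose nan values have been imputed.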
Answer 0 (score: 0)
Try this:
group['D'] = ((mice(group.apply({'C': lambda x: x.values, 'D': lambda y: y.values})))[1]).values
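Assigning to the column (`group['D'] = ...`) rather than to its `.values` property avoids the missing setter. Note that inside a groupby loop this only changes the local `group`, so to end up with one combined frame the groups still have to be collected and concatenated. A rough sketch of that pattern follows; `impute_stub` is a hypothetical stand-in that uses linear interpolation instead of `mice` (so it runs without impyute and will not reproduce mice's numbers), and the ten-row `df` is made up for illustration:

```python
import numpy as np
import pandas as pd

def impute_stub(values):
    # Hypothetical stand-in for impyute's mice: plain linear interpolation,
    # used only so this sketch runs without impyute installed.
    return pd.Series(values).interpolate(limit_direction='both').to_numpy()

# Made-up data: one group with a gap, one group without
df = pd.DataFrame({
    'A': ['CHN'] * 5 + ['USA'] * 5,
    'B': ['dragons'] * 10,
    'C': [2001.0, 2002.0, 2003.0, 2004.0, 2005.0] * 2,
    'D': [1.0, 4.0, 6.0, np.nan, 15.0, 3.0, 5.0, 7.0, 9.0, 11.0],
})

pieces = []
for key, group in df.groupby(['A', 'B']):
    group = group.copy()  # work on an explicit copy of the group
    if group['D'].isnull().any():
        # Assign to the column itself, not to the read-only .values property
        group['D'] = impute_stub(group['D'].to_numpy())
    pieces.append(group)  # untouched and imputed groups are both kept

# Stitch the groups back together in the original row order
result = pd.concat(pieces).sort_index()
print(result)
```

With the real data, the call to `impute_stub` would presumably be replaced by the `mice(...)[1].values` expression from the question; here the stand-in fills the gap with 10.5 rather than the 10.86 that `mice` produced.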