Question

I am trying to replicate the code provided here: https://github.com/IdoZehori/Credit-Score/blob/master/Credit%20score.ipynb

The function below fails to run and raises an error. Can someone help me fix it?

def replaceOutlier(data, method = outlierVote, replace='median'):
    '''replace: median (auto)
                'minUpper' which is the upper bound of the outlier detection'''
    vote = outlierVote(data)
    x = pd.DataFrame(zip(data, vote), columns=['annual_income', 'outlier'])
    if replace == 'median':
        replace = x.debt.median()
    elif replace == 'minUpper':
        replace = min([val for (val, vote) in list(zip(data, vote)) if vote == True])
        if replace < data.mean():
            return 'There are outliers lower than the sample mean'
    debtNew = []
    for i in range(x.shape[0]):
        if x.iloc[i][1] == True:
            debtNew.append(replace)
        else:
            debtNew.append(x.iloc[i][0])

    return debtNew

Function call:

incomeNew = replaceOutlier(df.annual_income, replace='minUpper')

Error: x = pd.DataFrame(zip(data, vote), columns=['annual_income', 'outlier']) TypeError: data argument can't be an iterator

PS: I understand this question has been asked before, but I tried the techniques suggested there and the error persists.

Answer 1

zip cannot be passed directly. In Python 3 it returns an iterator, so you should wrap the result in a list first, i.e.:

x = pd.DataFrame(list(zip(data, vote)), columns=['annual_income', 'outlier'])
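As a minimal sketch of this fix, the snippet below builds the same two-column frame from small made-up inputs. The values in `data` and `vote` are hypothetical stand-ins for `df.annual_income` and the result of `outlierVote(data)`; the dict-based constructor is shown only as an alternative that sidesteps zip, not as what the original notebook does.

```python
import pandas as pd

# Hypothetical stand-ins for `data` and `vote` from the question.
data = pd.Series([30000, 45000, 1000000, 52000], name="annual_income")
vote = [False, False, True, False]

# Materialize the zip iterator into a list of (value, flag) tuples
# before handing it to the DataFrame constructor.
x = pd.DataFrame(list(zip(data, vote)), columns=["annual_income", "outlier"])

# Alternative that avoids zip entirely: build the frame from named columns.
x_alt = pd.DataFrame({"annual_income": data.to_numpy(), "outlier": vote})

print(x)
```

Note that once the constructor succeeds, the `replace == 'median'` branch in the question would still need `x.debt` changed to match the column name actually used (`annual_income`), since `x` has no `debt` column.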

Answer 2

This actually works in pandas version 0.24.2 without having to wrap zip in list.

Answer 3

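This happens because of a data type issue. You can convert the data to a list first and then turn that list into a DataFrame. For example, pd.DataFrame(list(data)) should work.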

Answer 4

zip(list1,list2) works in Jupyter Notebook, but I found that list(zip(list1,list2)) is required in the standard Python interpreter.
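One way to check what zip returns in a given environment is sketched below, assuming Python 3. The names `list1` and `list2` come from the answer above; their contents and the other variable names are illustrative. On Python 3, zip returns a one-shot iterator wherever the code runs, so a difference between two setups may come from the installed pandas version (see Answer 2) rather than from Jupyter itself.

```python
import sys
from collections.abc import Iterator

list1 = [1, 2, 3]
list2 = ["a", "b", "c"]

pairs = zip(list1, list2)

# On Python 3, zip returns a lazy iterator rather than a list.
is_iterator = isinstance(pairs, Iterator)

# list() materializes the pairs; afterwards the iterator is exhausted,
# so a second pass over the same zip object yields nothing.
first_pass = list(pairs)
second_pass = list(pairs)

print(sys.version_info[:2], is_iterator, first_pass, second_pass)
```

Printing `pd.__version__` in both environments is a quick way to see whether they are running the same pandas release.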

Answer 5

Write it like this:

coef = DataFrame(list(zip(x.columns,np.transpose(log_model.coef_))))

Python: data argument can't be an iterator
