I am trying to fit a random forest using sklearn. Every time I run the algorithm, I get this error:
ValueError: could not convert string to float: '#DIV/0!'
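Searching on Stack Overflow, I found that this might be happening because I am trying to divide by zero. To avoid that, I multiplied every value in the DataFrame by 100 and then replaced every 0 with 1: given the range of the new values, that 1 should be irrelevant, or at least that was my thinking. The code I used is: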
df = df.mul(100)
df = df.replace(0, 1)
What happens is that when I now try to fit my RF, I get a new error:
ValueError: could not convert string to float: '-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932-30.68464932'
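A note on this second message: the same value repeated back to back is consistent with Python string repetition, which is what multiplying a string by an integer does. A minimal sketch, assuming the column was loaded as object dtype holding text (the column name `x` and its values are made up for illustration):

```python
import pandas as pd

# Hypothetical column: numbers stored as text next to a spreadsheet error marker.
df = pd.DataFrame({"x": pd.Series(["-30.68", "#DIV/0!"], dtype=object)})

# On an object column the multiplication is applied element by element,
# so each string is repeated instead of being scaled numerically.
repeated = df.mul(3)
print(repeated["x"].tolist())
```

If that is what happened here, the text '#DIV/0!' would be a literal cell value (the kind of marker a spreadsheet writes for a failed division), not a division performed by sklearn.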
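I am 100% sure that I am not using any strings as values in my dataset. Here is a small sample:
So my question now becomes: how do I fix this?
Edit
Using df.info(), I found that one column has dtype object. I fixed this with the following one-liner: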
df = df.apply(lambda col: pd.to_numeric(col, errors='coerce'))
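To see what errors='coerce' does to a value such as '#DIV/0!', here is a small sketch on made-up data (the column name `ratio` is hypothetical). Anything that cannot be parsed as a number is turned into NaN rather than raising:

```python
import pandas as pd

# Made-up column mixing numeric text with a spreadsheet error marker.
df = pd.DataFrame({"ratio": ["1.5", "#DIV/0!", "-30.68464932"]})

# Convert every column to numeric; unparseable entries become NaN.
converted = df.apply(lambda col: pd.to_numeric(col, errors="coerce"))

print(converted.dtypes)
print(converted["ratio"].isna().sum())  # number of entries that were coerced
```

Counting the NaN values per column this way shows how many cells were silently replaced during the conversion.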
Now all values have dtype float64. The problem is that I now get a new error:
ValueError: Input contains NaN, infinity or a value too large for dtype('float32').
Answer 0 (score: 0)
OK, after some further research I found a second one-liner that solved my problem: the fit now succeeds.
df = df[~df.isin([np.nan, np.inf, -np.inf]).any(axis=1)]
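For readers who find the isin-based filter hard to read, the sketch below shows what looks like an equivalent two-step form: treat infinities as missing, then drop every row that has a missing value. The data and column names are made up, and this is only one way to write it, not necessarily the one to prefer:

```python
import numpy as np
import pandas as pd

# Made-up frame with one NaN and one infinity in different rows.
df = pd.DataFrame({
    "a": [1.0, 2.0, np.nan, 4.0, 5.0],
    "b": [0.5, np.inf, 1.5, 2.0, 2.5],
})

# Filter from the answer: keep rows containing no NaN or infinity.
mask = df.isin([np.nan, np.inf, -np.inf]).any(axis=1)
filtered = df[~mask]

# Alternative spelling: map infinities to NaN, then drop incomplete rows.
clean = df.replace([np.inf, -np.inf], np.nan).dropna()

print(len(df), "rows before,", len(clean), "rows after")
```

Either way, rows are removed rather than repaired, so it is worth checking how many rows are lost before passing the result to the estimator's fit method.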