Problem imputing a DataFrame column with MissForest

Asked: 2019-03-30 16:09:24

Tags: python imputation

I am trying to impute a column of a DataFrame, but I keep getting errors.

(This is the imputation of one particular column.)

 from missingpy import MissForest
 imputer = MissForest()
 Imputed_Pollutants = imputer.fit_transform(df4.Ammonia)

But I get this error:

 Expected 2D array, got 1D array instead

When I try reshaping it:

 r = df4.Ammonia.reshape(-1,1)

 Imputed_Pollutants = imputer.fit_transform(r)

I still get an error:

 One or more columns have all rows missing

This is what r looks like:

 array([[nan],
        [nan],
        [nan],
        ...,
        [nan],
        [nan],
        [nan]])

(The printout of the Ammonia column before reshaping is missing from this copy.)

Any suggestions would be much appreciated. Thanks, everyone.

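1 Answer:

Answer 0: (score: 0)

Ammonia is your candidate column, that is, the column whose missing values you want to impute. MissForest() from the missingpy library uses all of the remaining columns to do that. More at https://pypi.org/project/missingpy/ So pass the whole DataFrame instead of the single column: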

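    import pandas as pd
    from missingpy import MissForest

    imputer = MissForest()
    # Fit on the whole DataFrame so the other columns can be used as predictors
    Imputed_Pollutants = imputer.fit_transform(df4)
    # fit_transform returns a NumPy array; restore the column names
    Imputed_Pollutants = pd.DataFrame(Imputed_Pollutants, columns=df4.columns)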
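
Both error messages in the question can be checked before calling the imputer. The sketch below is only an illustration: the DataFrame is a made-up stand-in for df4 (the column names other than Ammonia and all values are assumptions), and it does not call missingpy at all. It shows why a single column selected as `df4.Ammonia` is 1D, and how to find columns in which every row is missing.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df4; names and values are invented for illustration.
df4 = pd.DataFrame({
    "Ammonia": [0.5, np.nan, 0.7, np.nan],
    "Nitrate": [1.0, 1.2, np.nan, 1.1],
    "Sulfate": [np.nan, np.nan, np.nan, np.nan],  # every row missing
})

# Attribute access returns a 1D Series ...
one_d = df4.Ammonia
# ... while double brackets keep a 2D, one-column DataFrame.
two_d = df4[["Ammonia"]]

# Columns with no observed value at all; these carry nothing to learn from.
all_missing = [col for col in df4.columns if df4[col].isna().all()]

# Keep only the columns that have at least one observed value.
imputable = df4.dropna(axis=1, how="all")

print(one_d.shape, two_d.shape)
print(all_missing)
print(imputable.columns.tolist())
```

If a check like this reports that some columns are entirely missing, one option is to pass `imputable` rather than `df4` to `fit_transform` and to build the resulting DataFrame with `columns=imputable.columns`.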