pandas_profiling.ProfileReport error: ValueError: cannot convert float NaN to integer

Time: 2018-11-30 09:30:54

Tags: python-3.x pandas

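I have a DataFrame called combined, and I take a subset of it that I call A. When I run ProfileReport on combined, there is no problem. When I run the report on A, I get the error above. Here is the code: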

A = combined.loc[combined.xy == False]
pandas_profiling.ProfileReport(A)  # this gives me the error
pandas_profiling.ProfileReport(combined.loc[combined.xy == False])  # same error
pandas_profiling.ProfileReport(combined)  # no error

Here is the error:

    C:\Users\xy\AppData\Local\Continuum\Anaconda2\envs\py36\lib\site-packages\pandas_profiling\report.py:60: RuntimeWarning: invalid value encountered in longlong_scalars
  width = int(freq / max_freq * 99) + 1
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-74-bf3aa50b97ad> in <module>()
----> 1 pandas_profiling.ProfileReport(A)

~\AppData\Local\Continuum\Anaconda2\envs\py36\lib\site-packages\pandas_profiling\__init__.py in __init__(self, df, **kwargs)
     67 
     68         self.html = to_html(sample,
---> 69                             description_set)
     70 
     71         self.description_set = description_set

~\AppData\Local\Continuum\Anaconda2\envs\py36\lib\site-packages\pandas_profiling\report.py in to_html(sample, stats_object)
    172                                                        templates.template('freq_table'), templates.template('freq_table_row'), 10)
    173             formatted_values['firstn_expanded'] = extreme_obs_table(stats_object['freq'][idx], templates.template('freq_table'), templates.template('freq_table_row'), 5, n_obs, ascending = True)
--> 174             formatted_values['lastn_expanded'] = extreme_obs_table(stats_object['freq'][idx], templates.template('freq_table'), templates.template('freq_table_row'), 5, n_obs, ascending = False)
    175 
    176         rows_html += templates.row_templates_dict[row['type']].render(values=formatted_values, row_classes=row_classes)

~\AppData\Local\Continuum\Anaconda2\envs\py36\lib\site-packages\pandas_profiling\report.py in extreme_obs_table(freqtable, table_template, row_template, number_to_print, n, ascending)
    123 
    124         for label, freq in six.iteritems(obs_to_print):
--> 125             freq_rows_html += _format_row(freq, label, max_freq, row_template, n)
    126 
    127         return table_template.render(rows=freq_rows_html)

~\AppData\Local\Continuum\Anaconda2\envs\py36\lib\site-packages\pandas_profiling\report.py in _format_row(freq, label, max_freq, row_template, n, extra_class)
     58 
     59     def _format_row(freq, label, max_freq, row_template, n, extra_class=''):
---> 60             width = int(freq / max_freq * 99) + 1
     61             if width > 20:
     62                 label_in_bar = freq

ValueError: cannot convert float NaN to integer
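
The warning and the exception both point at the same line, `width = int(freq / max_freq * 99) + 1`. The snippet below is a minimal sketch of how that arithmetic can fail, not code from pandas_profiling. It assumes that `freq` and `max_freq` arrive as NumPy int64 scalars and that both are 0, which is what an "invalid value" warning on integer scalars suggests. The helper name `bar_width` is hypothetical.

```python
import numpy as np


def bar_width(freq, max_freq):
    # Same arithmetic as line 60 of report.py in the traceback above.
    return int(freq / max_freq * 99) + 1


# Assumption: both counts are NumPy int64 scalars equal to 0.
freq, max_freq = np.int64(0), np.int64(0)

# errstate only silences the RuntimeWarning so the demo output stays clean.
with np.errstate(invalid="ignore"):
    ratio = freq / max_freq  # 0 / 0 on NumPy scalars yields NaN, not an exception
    try:
        bar_width(freq, max_freq)
        message = None
    except ValueError as exc:
        message = str(exc)

print(ratio)
print(message)
```

If this assumption holds, the subset A contains at least one column whose frequency table has only zero counts in the rows being rendered, while the full DataFrame does not.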

I hope you can help me.

1 answer:

Answer 0: (score: 0)

I solved this problem with the following loop:

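import numpy as np

for c in DB:
    print(c)
    print(DB[c].dtypes)
    # Leave bool, float64, uint64, uint8, datetime and timedelta columns alone.
    # Every other column (int64 and object included) is cast to str, then to category.
    # Pandas columns carry unit-specific dtypes, so compare against '<M8[ns]' / '<m8[ns]'.
    if DB[c].dtypes != bool and DB[c].dtypes != np.float64 and DB[c].dtypes != np.uint64 and DB[c].dtypes != np.uint8 and DB[c].dtypes != np.dtype('<M8[ns]') and DB[c].dtypes != np.dtype('<m8[ns]'):
        DB[c] = DB[c].astype("str")
        DB[c] = DB[c].astype("category")
    # Booleans are cast to int.
    elif DB[c].dtypes == bool:
        DB[c] = DB[c].astype("int")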

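This converts every data type except booleans, dates, and a few numeric types to strings and then to categories. The column has to be cast to string first and then to category; otherwise the error persists.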
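
The loop above can also be packaged as a reusable function. The following is a sketch under stated assumptions rather than the answerer's exact method: the name `prepare_for_profiling` is hypothetical, it works on a copy instead of modifying the DataFrame in place, and it uses the `pandas.api.types` helpers. Unlike the original loop, it leaves every numeric dtype untouched, int64 included.

```python
import pandas as pd
from pandas.api import types as ptypes


def prepare_for_profiling(df):
    """Return a copy with bools cast to int and other non-numeric,
    non-date columns cast to str and then to category."""
    out = df.copy()
    for col in out.columns:
        s = out[col]
        if ptypes.is_bool_dtype(s):
            # Check bool first: is_numeric_dtype is also True for bool.
            out[col] = s.astype("int")
        elif (ptypes.is_numeric_dtype(s)
              or ptypes.is_datetime64_any_dtype(s)
              or ptypes.is_timedelta64_dtype(s)):
            continue  # numbers, dates and timedeltas stay as they are
        else:
            # String first, then category, as the answer describes.
            out[col] = s.astype("str").astype("category")
    return out


# Small made-up frame, for illustration only.
df = pd.DataFrame({
    "flag": [True, False],
    "name": ["a", None],
    "x": [1.5, 2.5],
    "when": pd.to_datetime(["2018-11-30", "2018-12-01"]),
})
clean = prepare_for_profiling(df)
print(clean.dtypes)
```

The result of `prepare_for_profiling(A)` could then be passed to `pandas_profiling.ProfileReport` in place of `A`.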