ValueError when trying to create a DataFrame of summary statistics

Time: 2019-08-07 17:22:33

Tags: python pandas

I want to create a DataFrame that holds just two columns of counts. Every time I try to fill in the counts, I get this error:

ValueError: cannot convert float NaN to integer

Here is my code:

BoxTrackingSummary_df = pd.DataFrame()
BoxTrackingSummary_df_columns = ['School - Exams Tracked', 'School - Exams Not Tracked']

summary_group = BoxTrackingReport_df.groupby('Tracked At A Site?').agg('count')['All Box Tracked Sites']


BoxTrackingSummary_df['School - Exams Tracked'] = summary_group['NO']
BoxTrackingSummary_df['School - Exams Not Tracked'] = summary_group['TRACKED']

My summary_group returns:

Tracked At A Site?
NO          3
TRACKED    18
Name: All Box Tracked Sites, dtype: int64

I want a DataFrame that looks like this:

School - Exams Tracked         School - Exams Not Tracked
                    18                                  3

Full traceback:

Traceback (most recent call last):
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2019.1.3\helpers\pydev\_pydev_comm\server.py", line 34, in handle
    self.processor.process(iprot, oprot)
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2019.1.3\helpers\third_party\thriftpy\_shaded_thriftpy\thrift.py", line 266, in process
    self.handle_exception(e, result)
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2019.1.3\helpers\third_party\thriftpy\_shaded_thriftpy\thrift.py", line 254, in handle_exception
    raise e
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2019.1.3\helpers\third_party\thriftpy\_shaded_thriftpy\thrift.py", line 263, in process
    result.success = call()
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2019.1.3\helpers\third_party\thriftpy\_shaded_thriftpy\thrift.py", line 228, in call
    return f(*(args.__dict__[k] for k in api_args))
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2019.1.3\helpers\pydev\_pydev_bundle\pydev_console_utils.py", line 236, in getArray
    return pydevd_thrift.table_like_struct_to_thrift_struct(array, name, roffset, coffset, rows, cols, format)
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2019.1.3\helpers\pydev\_pydevd_bundle\pydevd_thrift.py", line 602, in table_like_struct_to_thrift_struct
    return TYPE_TO_THRIFT_STRUCT_CONVERTERS[type_name](array, name, roffset, coffset, rows, cols, format)
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2019.1.3\helpers\pydev\_pydevd_bundle\pydevd_thrift.py", line 545, in dataframe_to_thrift_struct
    array_chunk.headers = header_data_to_thrift_struct(rows, cols, dtypes, col_bounds, col_to_format, df, dim)
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2019.1.3\helpers\pydev\_pydevd_bundle\pydevd_thrift.py", line 577, in header_data_to_thrift_struct
    col_header.max = col_format % bounds[1]

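My workflow is to build each DataFrame separately and test it. My testing utilities are written to accept DataFrames rather than Series, so I want a DataFrame even for a simple summary table like this one.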

1 Answer:

Answer 0: (score: 0)

I forgot to use .loc:

BoxTrackingSummary_df['School - Exams Tracked'] = summary_group['NO']
BoxTrackingSummary_df['School - Exams Not Tracked'] = summary_group['TRACKED']
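
The snippet as captured here does not show where .loc was applied, so the following is only a minimal sketch of what is likely going wrong and one way around it, not the answerer's exact fix. Assigning a scalar to a column of an empty DataFrame adds the column but no rows, so the frame stays empty. The last traceback frame is inside PyCharm's DataFrame viewer, which plausibly fails when it formats the NaN bounds of an empty column. The summary_group below is a hand-built stand-in for the groupby result, with values copied from the question. Building the frame in one step from one-element lists is a substitute for the .loc approach.

```python
import pandas as pd

# Stand-in for the groupby result shown in the question (an assumption:
# values are copied from the printed output, not recomputed from real data).
summary_group = pd.Series(
    [3, 18],
    index=pd.Index(["NO", "TRACKED"], name="Tracked At A Site?"),
    name="All Box Tracked Sites",
)

# Assigning a scalar to a column of an empty DataFrame creates the column
# but adds no rows, so the result is still an empty frame.
empty_df = pd.DataFrame()
empty_df["School - Exams Tracked"] = summary_group["TRACKED"]
print(len(empty_df))

# Wrapping each count in a one-element list gives the frame exactly one row.
BoxTrackingSummary_df = pd.DataFrame(
    {
        "School - Exams Tracked": [summary_group["TRACKED"]],
        "School - Exams Not Tracked": [summary_group["NO"]],
    }
)
print(BoxTrackingSummary_df)
```

The column mapping here follows the desired table in the question (18 tracked, 3 not tracked), which is the reverse of the assignments in the original code.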