Pandas optimization causes problems when converting to float

Time: 2019-06-06 12:11:03

Tags: python json pandas

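I am trying to optimize our memory usage by converting the DataFrame's column types to more efficient ones (today float64 accounts for a lot of it: each value takes 8 bytes, which is very inefficient).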

I wrote the code below (please tell me if I am wrong to think this could have worked in the first place):

@staticmethod
def optimize_float(series):
    low_consumption = series.astype('float16')
    if any(low_consumption.isin([np.inf])):
        medium_consumption = series.astype('float32')
        if any(medium_consumption.isin([np.inf])):
            high_consumption = series.astype('float64')
            return high_consumption
        return medium_consumption
    return low_consumption

This code does in fact reduce the DataFrame's memory usage considerably.
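For comparison, here is a minimal sketch of a variant that checks the value range before casting instead of looking for inf afterwards. It is not the code from my project. It assumes that only magnitude matters (precision loss inside the representable range is ignored), and it uses np.finfo to look up the largest finite value of each dtype.

```python
import numpy as np
import pandas as pd


def optimize_float(series: pd.Series) -> pd.Series:
    """Downcast to the smallest float dtype whose range covers the data.

    Sketch only: this checks magnitude, not precision, so values may
    still be rounded (for example 11111 becomes 11112 in float16).
    """
    # Ignore NaN and +/-inf when measuring the range of the data.
    finite = series[np.isfinite(series)]
    if finite.empty:
        return series

    largest = finite.abs().max()
    for dtype in (np.float16, np.float32):
        # np.finfo(dtype).max is the largest finite value of that dtype.
        if largest <= np.finfo(dtype).max:
            return series.astype(dtype)
    return series.astype(np.float64)
```

Because the check uses the absolute value, large negative numbers are handled the same way as large positive ones.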

However, later in the code I get the following error:

Traceback (most recent call last):
  File "/Users//.local/share/virtualenvs/romee-dkO78zKu/lib/python3.6/site-packages/flask/app.py", line 1985, in wsgi_app
    response = self.handle_exception(e)
  File "/Users//.local/share/virtualenvs/romee-dkO78zKu/lib/python3.6/site-packages/flask_restful/__init__.py", line 273, in error_router
    return original_handler(e)
  File "/Users//.local/share/virtualenvs/romee-dkO78zKu/lib/python3.6/site-packages/flask_cors/extension.py", line 161, in wrapped_function
    return cors_after_request(app.make_response(f(*args, **kwargs)))
  File "/Users//.local/share/virtualenvs/romee-dkO78zKu/lib/python3.6/site-packages/flask/app.py", line 1540, in handle_exception
    reraise(exc_type, exc_value, tb)
  File "/Users//.local/share/virtualenvs/romee-dkO78zKu/lib/python3.6/site-packages/flask/_compat.py", line 32, in reraise
    raise value.with_traceback(tb)
  File "/Users//.local/share/virtualenvs/romee-dkO78zKu/lib/python3.6/site-packages/flask/app.py", line 1982, in wsgi_app
    response = self.full_dispatch_request()
  File "/Users//.local/share/virtualenvs/romee-dkO78zKu/lib/python3.6/site-packages/flask/app.py", line 1614, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/Users//.local/share/virtualenvs/romee-dkO78zKu/lib/python3.6/site-packages/flask_restful/__init__.py", line 273, in error_router
    return original_handler(e)
  File "/Users//.local/share/virtualenvs/romee-dkO78zKu/lib/python3.6/site-packages/flask_cors/extension.py", line 161, in wrapped_function
    return cors_after_request(app.make_response(f(*args, **kwargs)))
  File "/Users//.local/share/virtualenvs/romee-dkO78zKu/lib/python3.6/site-packages/flask/app.py", line 1518, in handle_user_exception
    return handler(e)
  File "/Users//Desktop/code/romee/autoai/server/__init__.py", line 186, in handle_general_exeption
    raise error
  File "/Users//.local/share/virtualenvs/romee-dkO78zKu/lib/python3.6/site-packages/flask_restful/__init__.py", line 270, in error_router
    return self.handle_error(e)
  File "/Users//.local/share/virtualenvs/romee-dkO78zKu/lib/python3.6/site-packages/flask/app.py", line 1612, in full_dispatch_request
    rv = self.dispatch_request()
  File "/Users//.local/share/virtualenvs/romee-dkO78zKu/lib/python3.6/site-packages/flask_debugtoolbar/__init__.py", line 125, in dispatch_request
    return view_func(**req.view_args)
  File "/Users//.local/share/virtualenvs/romee-dkO78zKu/lib/python3.6/site-packages/flask_restful/__init__.py", line 484, in wrapper
    return self.make_response(data, code, headers=headers)
  File "/Users//.local/share/virtualenvs/romee-dkO78zKu/lib/python3.6/site-packages/flask_restful/__init__.py", line 513, in make_response
    resp = self.representations[mediatype](data, *args, **kwargs)
  File "/Users//.local/share/virtualenvs/romee-dkO78zKu/lib/python3.6/site-packages/flask_restful/representations/json.py", line 21, in output_json
    dumped = dumps(data, **settings) + "\n"
  File "/Users//.pyenv/versions/3.6.5/lib/python3.6/json/__init__.py", line 238, in dumps
    **kw).encode(obj)
  File "/Users//.pyenv/versions/3.6.5/lib/python3.6/json/encoder.py", line 201, in encode
    chunks = list(chunks)
  File "/Users//.pyenv/versions/3.6.5/lib/python3.6/json/encoder.py", line 430, in _iterencode
    yield from _iterencode_dict(o, _current_indent_level)
  File "/Users//.pyenv/versions/3.6.5/lib/python3.6/json/encoder.py", line 404, in _iterencode_dict
    yield from chunks
  File "/Users//.pyenv/versions/3.6.5/lib/python3.6/json/encoder.py", line 325, in _iterencode_list
    yield from chunks
  File "/Users//.pyenv/versions/3.6.5/lib/python3.6/json/encoder.py", line 404, in _iterencode_dict
    yield from chunks
  File "/Users//.pyenv/versions/3.6.5/lib/python3.6/json/encoder.py", line 325, in _iterencode_list
    yield from chunks
  File "/Users//.pyenv/versions/3.6.5/lib/python3.6/json/encoder.py", line 404, in _iterencode_dict
    yield from chunks
  File "/Users//.pyenv/versions/3.6.5/lib/python3.6/json/encoder.py", line 396, in _iterencode_dict
    yield _floatstr(value)
  File "/Users//.pyenv/versions/3.6.5/lib/python3.6/json/encoder.py", line 241, in floatstr
    repr(o))
ValueError: Out of range float values are not JSON compliant: -inf

The stack trace is not very helpful, because it does not point to the place in my code where this error originates.
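The last frames do show that the standard-library json encoder is the one rejecting -inf, which it does when it is called with allow_nan=False (presumably what the response settings use here). As a possible workaround, here is a minimal sketch that replaces non-finite floats with None before the payload is serialized. The helper name to_json_safe and the sample payload are hypothetical.

```python
import json
import math


def to_json_safe(value):
    """Recursively replace non-finite floats (inf, -inf, nan) with None."""
    if isinstance(value, float) and not math.isfinite(value):
        return None
    if isinstance(value, dict):
        return {key: to_json_safe(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_json_safe(item) for item in value]
    return value


# Hypothetical payload containing values that strict JSON cannot represent.
payload = {"std": float("-inf"), "values": [1.0, float("nan")]}
print(json.dumps(to_json_safe(payload), allow_nan=False))
# prints {"std": null, "values": [1.0, null]}
```

This only hides the symptom at the serialization boundary. It does not explain where the infinite value was produced.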

I tested the optimize_float method and saw that it does not return np.inf values (the if any(medium_consumption.isin([np.inf])) check should take care of that).
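One thing worth noting about that check: isin([np.inf]) only matches positive infinity, while the error above reports -inf. A small sketch of the difference, using a made-up series whose first value overflows float16 in the negative direction:

```python
import numpy as np
import pandas as pd

# -100000 is below the float16 range, so the cast produces -inf.
with np.errstate(over="ignore"):
    s = pd.Series([-100000.0, 1.0]).astype("float16")

only_positive = bool(s.isin([np.inf]).any())  # -inf is not equal to +inf
both_signs = bool(np.isinf(s).any())          # np.isinf matches +inf and -inf

print(only_positive, both_signs)
# prints False True
```

So a series with large negative values can pass the isin check and still contain -inf.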

What am I missing?

Edit: I noticed that this happens when I try to send the client the output of the describe function for this series.

Reproducible example:

df = pd.DataFrame({'a':[11111,22222,3333]})
df['a'].describe()
Out[84]: 
count        3.000000
mean     12222.000000
std       9493.383011
min       3333.000000
25%       7222.000000
50%      11111.000000
75%      16666.500000
max      22222.000000
Name: a, dtype: float64
df['a'].astype('float16').describe()
Out[85]: 
count    3.000000e+00
mean     1.222400e+04
std               inf
min      3.332000e+03
25%      7.222000e+03
50%      1.111200e+04
75%      1.666800e+04
max      2.222400e+04
Name: a, dtype: float64
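My guess at why std becomes inf here, even though every input value fits in float16: the standard deviation needs squared deviations from the mean, and those are far larger than the biggest finite float16 value. A quick sketch of the arithmetic, done in float64 so that nothing overflows:

```python
import numpy as np

# Largest finite value a float16 can hold.
f16_max = float(np.finfo(np.float16).max)

values = np.array([11111, 22222, 3333], dtype=np.float64)
squared_dev = (values - values.mean()) ** 2

print(f16_max)
# prints 65504.0
print(squared_dev.max() > f16_max)
# prints True
```

The largest squared deviation is 10000 squared, which is 1e8, so an intermediate result stored as float16 would overflow to inf.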

The question now is: how can I round the values returned by describe so that they are never inf (while still keeping the column as float16)?
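One option I am considering, shown as a sketch rather than a confirmed fix: keep the stored column as float16 and upcast a temporary copy to float64 only for the describe call, so the intermediate calculations have enough range.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [11111, 22222, 3333]})
df["a"] = df["a"].astype("float16")  # the stored column stays float16

# Upcast a temporary copy for the computation only.
stats = df["a"].astype("float64").describe()

print(df["a"].dtype)
# prints float16
print(bool(np.isfinite(stats).all()))
# prints True
```

The temporary float64 copy costs memory for the duration of the call, which may or may not be acceptable depending on the size of the column.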

0 answers:

No answers