Get the memory usage value from Pandas DataFrame.info()

Date: 2018-04-26 18:16:19

Tags: pandas

How can I get the memory usage value (shown in the output of DataFrame.info()) and assign it to a variable?

2 answers:

Answer 0: (score: 3)

DataFrame.memory_usage().sum()

There is an example on this page:

In [8]: df.memory_usage()
Out[8]: 
Index                 72
bool                5000
complex128         80000
datetime64[ns]     40000
float64            40000
int64              40000
object             40000
timedelta64[ns]    40000
categorical         5800
dtype: int64

# total memory usage of dataframe
In [9]: df.memory_usage().sum()
Out[9]: 290872

The source code of df.info() shows that memory_usage() is how pandas computes the memory usage figure reported by df.info():

... <last few lines of def info from pandas/frame.py>
    mem_usage = self.memory_usage(index=True, deep=deep).sum()
    lines.append("memory usage: %s\n" %
                 _sizeof_fmt(mem_usage, size_qualifier))
_put_lines(buf, lines)
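
To assign the value to a variable, the same call can be used directly. The following is a minimal sketch; the DataFrame contents are made up for illustration. Passing deep=True also measures the Python objects stored in object columns, which is why info() prints a "+" after the shallow figure when such columns are present.

```python
import pandas as pd

# Hypothetical example frame; any DataFrame works the same way.
df = pd.DataFrame({"someCol": ["foo", "bar"], "n": [1, 2]})

# Shallow estimate: for object-dtype columns this counts only the
# per-element references, not the strings they point to.
shallow_bytes = int(df.memory_usage(index=True).sum())

# Deep estimate: also inspects the objects held in object columns.
deep_bytes = int(df.memory_usage(index=True, deep=True).sum())

print(shallow_bytes, deep_bytes)
```

Both values are plain integers in bytes, so they can be stored, compared, or formatted as needed.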

Answer 1: (score: 3)

As the docs say, info() accepts a buf argument to write to:

buf : writable buffer, defaults to sys.stdout

So, for a DataFrame df:

import io
import pandas as pd
df=pd.DataFrame({
    'someCol' : ["foo", "bar"]
}) 
buf = io.StringIO()
df.info(buf=buf)
info = buf.getvalue()
print(info)

gives me the output:

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2 entries, 0 to 1
Data columns (total 1 columns):
someCol    2 non-null object
dtypes: object(1)
memory usage: 96.0+ bytes

For the memory usage line specifically:

info = buf.getvalue().split('\n')[-2]
print(info)

which gives the output:

memory usage: 96.0+ bytes
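
If a number is needed rather than the formatted line, the text can be parsed. The sketch below assumes the line has the form "memory usage: <number>[+] <unit>", as in the output above; parse_memory_usage is a hypothetical helper, not part of pandas.

```python
import re

def parse_memory_usage(info_text):
    """Return (value, is_lower_bound, unit) from the 'memory usage' line."""
    match = re.search(r"memory usage: ([\d.]+)(\+?) (\w+)", info_text)
    if match is None:
        raise ValueError("no 'memory usage' line found")
    value, plus, unit = match.groups()
    # A trailing "+" means the reported figure is only a lower bound.
    return float(value), plus == "+", unit

print(parse_memory_usage("memory usage: 96.0+ bytes\n"))
# prints (96.0, True, 'bytes')
```

The parsed value is rounded and scaled to the printed unit, so df.memory_usage().sum() is likely the better choice when an exact byte count is required.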