Pandas DataFrame Memory Usage – Python

DataFrame.memory_usage() is used to get the memory usage of each column in bytes.

The memory usage can optionally include the contribution of the index and elements of object dtype.

This value is displayed in DataFrame.info by default. This can be suppressed by setting pandas.options.display.memory_usage to False.

  • Syntax:
    df.memory_usage(index=True, deep=False)
  • Parameters:
    index : bool, default True
    Specifies whether to include the memory usage of the DataFrame's index in returned Series. If index=True, the memory usage of the index is the first item in the output.
    
    deep : bool, default False
    If True, introspect the data deeply by interrogating object dtypes for system-level memory consumption, and include it in the returned values.
  • Returns:
  • Series = A Series whose index is the original column names and whose values is the memory usage of each column in bytes.
  • Examples:
    >>> dtypes = ['int64', 'float64', 'complex128', 'object', 'bool']
    >>> data = dict([(t, np.ones(shape=5000).astype(t))
    ...              for t in dtypes])
    >>> df = pd.DataFrame(data)
    >>> df.head()
       int64  float64            complex128  object  bool
    0      1      1.0    1.000000+0.000000j       1  True
    1      1      1.0    1.000000+0.000000j       1  True
    2      1      1.0    1.000000+0.000000j       1  True
    3      1      1.0    1.000000+0.000000j       1  True
    4      1      1.0    1.000000+0.000000j       1  True
    >>> df.memory_usage()
    Index           128
    int64         40000
    float64       40000
    complex128    80000
    object        40000
    bool           5000
    dtype: int64
    >>> df.memory_usage(index=False)
    int64         40000
    float64       40000
    complex128    80000
    object        40000
    bool           5000
    dtype: int64
    The memory footprint of object dtype columns is ignored by default:
    
    >>> df.memory_usage(deep=True)
    Index            128
    int64          40000
    float64        40000
    complex128     80000
    object        160000
    bool            5000
    dtype: int64
    Use a Categorical for efficient storage of an object-dtype column with
    many repeated values.
    
    >>> df['object'].astype('category').memory_usage(deep=True)
    5216
  • numpy.ndarray.nbytes
    Total bytes consumed by the elements of an ndarray.
  • Series.memory_usage
    Bytes consumed by a Series.
  • Categorical
    Memory-efficient array for string values with many repeated values.
  • DataFrame.info
    Concise summary of a DataFrame.