Published: 29.12.2023

I've been a heavy NumPy user for a long time and just recently learned that I have been wasting memory all along. Granted, the problems I've worked on have not required excessive amounts of memory, but still. Things are, however, different at Equinor, where we regularly deal with hundreds of realizations of Gaussian random fields with millions of parameters.

Here, I'll show you a few tricks for saving memory. To gain a deeper understanding of how NumPy works and learn more tricks, read *The NumPy array: a structure for efficient numerical computation*.

But before we look at some NumPy code, I'll briefly introduce pytest-memray, which measures the memory usage of tests.

## Using pytest-memray for Memory Measurement

With pytest-memray installed, tests can be decorated with **@pytest.mark.limit_memory**.
This decorator takes the expected memory usage as a parameter and causes tests to fail if they exceed the specified threshold.

The following pseudo-test will fail if it uses more than 100 MB of memory. Simple and useful.

```python
# This test will fail
@pytest.mark.limit_memory("100 MB")
def test_memory_usage():
    x = allocate_more_than_100MiB_of_memory()
```

Note that pytest-memray measures memory in MiB (Mebibytes) rather than MB (Megabytes). I'll refer to MB in this post since that's how the threshold is specified in the decorator.
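To make the unit distinction concrete, here is a small sketch of the arithmetic behind the two units:

```python
# 1 MiB (mebibyte) is 1024**2 bytes; 1 MB (megabyte) is 10**6 bytes.
mib_100 = 100 * 1024**2  # 104_857_600 bytes
mb_100 = 100 * 10**6     # 100_000_000 bytes
print(mib_100 - mb_100)  # prints 4857600: roughly a 4.9 MB gap at the 100 MB mark
```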

## Example 1: In-Place Operations

To multiply a matrix by a constant you might do something like this (the imports and `rng` setup here are shared by the remaining examples):

```python
import numpy as np
import pytest

rng = np.random.default_rng()


@pytest.mark.limit_memory("15.5 MB")
def test_memory_usage_without_in_place_operations():
    A = rng.normal(size=(1000, 1000))
    A = A * 2
```

Reduce memory usage by half through the use of in-place operations:
```python
@pytest.mark.limit_memory("7.8 MB")
def test_memory_usage_with_in_place_operations():
    A = rng.normal(size=(1000, 1000))
    A *= 2
```
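The same effect can be had with an explicit function call: NumPy ufuncs accept an `out=` argument that writes the result into an existing array instead of allocating a new one. A minimal sketch:

```python
import numpy as np

rng = np.random.default_rng()
A = rng.normal(size=(1000, 1000))
# Result is written into A; no temporary (1000, 1000) array is allocated.
np.multiply(A, 2, out=A)
```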

## Example 2: Creating Submatrices

A common way to create submatrices is this:

```python
@pytest.mark.limit_memory("13.4 MB")
def test_memory_usage_when_creating_submatrices():
    m = 1000
    X = rng.normal(size=(m, m))
    mask = np.arange(m, step=2)
    A = X[mask, :][:, mask]
```

The problem with this is that **X[mask, :][:, mask]** creates the intermediate subarray **X[mask, :]**.
Avoid creating the subarray by making use of the somewhat funky-looking **np.ix_**:

```python
@pytest.mark.limit_memory("9.7 MB")
def test_better_way_of_creating_submatrix():
    m = 1000
    X = rng.normal(size=(m, m))
    mask = np.arange(m, step=2)
    # np.ix_ creates an open mesh from multiple sequences
    A = X[np.ix_(mask, mask)]
```
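To see what the "open mesh" looks like, here is a small illustration on a toy array (the indices are made up for the example):

```python
import numpy as np

# np.ix_ returns a (2, 1) column of row indices and a (1, 2) row of
# column indices; broadcasting them selects the (2, 2) submatrix directly.
rows, cols = np.ix_([0, 2], [1, 3])
print(rows.shape, cols.shape)  # prints (2, 1) (1, 2)

X = np.arange(16).reshape(4, 4)
print(X[rows, cols])  # prints [[ 1  3]
                      #         [ 9 11]]
```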

## Example 3: Multiplying Multiple Matrices

Multiplying multiple matrices can be done like this:

```python
@pytest.mark.limit_memory("23.5 MB")
def test_without_multidot():
    m = 1000
    A = rng.normal(size=(m, m))
    B = rng.normal(size=(m, m))
    x = rng.normal(size=(m, 1))
    C = A @ B @ x
```

Use **np.linalg.multi_dot** to save some memory:

```python
@pytest.mark.limit_memory("15.4 MB")
def test_with_multidot():
    m = 1000
    A = rng.normal(size=(m, m))
    B = rng.normal(size=(m, m))
    x = rng.normal(size=(m, 1))
    C = np.linalg.multi_dot([A, B, x])
```
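The saving comes from evaluation order: `A @ B @ x` evaluates left to right, so it first builds the full (m, m) product `A @ B`, whereas multiplying `B @ x` first only ever creates (m, 1) intermediates. `multi_dot` picks the cheapest order automatically. A small sketch of the equivalence:

```python
import numpy as np

rng = np.random.default_rng(0)
m = 100
A = rng.normal(size=(m, m))
B = rng.normal(size=(m, m))
x = rng.normal(size=(m, 1))

left = (A @ B) @ x   # builds an (m, m) intermediate A @ B
right = A @ (B @ x)  # only an (m, 1) intermediate B @ x
assert np.allclose(left, right)  # same result, very different memory profiles
```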

## Example 4: Efficiently Adding One to Diagonal

Here's an example from an algorithm implemented in ERT:

\[\boldsymbol{\Omega} = \boldsymbol{I} + \boldsymbol{W}\left( \boldsymbol{I}_N - \frac{1}{N} \boldsymbol{1}\boldsymbol{1}^T \right) / \sqrt{N - 1}\]

The naive approach:

```python
@pytest.mark.limit_memory("15.3 MB")
def test_naively_adding_one_to_diagonal():
    m = 1000
    W = rng.normal(size=(m, m))
    Omega = W + np.identity(m)
```

The less naive approach:
```python
@pytest.mark.limit_memory("7.7 MB")
def test_better_adding_one_to_diagonal():
    m = 1000
    W = rng.normal(size=(m, m))
    np.fill_diagonal(W, W.diagonal() + 1)
    Omega = W
```
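As a sanity check that the in-place version computes the same matrix (a small sketch on a toy size, not one of the original tests):

```python
import numpy as np

rng = np.random.default_rng(0)
m = 5
W = rng.normal(size=(m, m))
expected = W + np.identity(m)          # allocates an extra (m, m) identity
np.fill_diagonal(W, W.diagonal() + 1)  # in place: only an (m,) temporary
assert np.allclose(W, expected)
```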

There you have it: a nice way of measuring memory usage with pytest-memray, and a few NumPy tricks that reduce memory usage.