
Run Memory Prediction


Home > Model Development Topics > Run memory prediction

This topic is under construction and/or revision.

OpenM++ can predict the memory required by a model run from a handful of option settings tied to a population size parameter, so that runs and parallelism can be planned to fit the memory available. This topic motivates memory prediction with an example of memory stress, and shows how the option settings can be determined, either with assistance from the model itself or by manual calculation using probing runs.

Related topics

Topic contents

Introduction and outline
Motivating example
Assisted calculation
Manual calculation

Introduction and outline

Content to follow.

[back to topic contents]

Motivating example

Here are results for a 900k run (starting population of 900,000) performed on Windows and in a Linux VM with different amounts of memory:

| Platform (VM) | Memory | Run status | Elapsed time | Notes |
| --- | --- | --- | --- | --- |
| Windows | 8 GB | Success | 1d 3h 38m | Some anomalies at end of run, probably due to long elapsed time with open files on the virtual file system. |
| Linux | 4 GB | Crash | - | Model crash during creation of starting population. |
| Linux | 8 GB | Crash | 50m | Model crash after progressive slowdown (~50m at 51% complete). |
| Linux | 12 GB | Crash | 55m | Model crash after progressive slowdown (~55m at 97% complete). |
| Linux | 16 GB | Success | 29m | Success. |
| Linux | 20 GB | Success | 30m | Success. |

A crash is the most extreme symptom of memory stress, but progressive slowdown is also a sign. As the amount of free memory decreases, it becomes more difficult for memory allocation algorithms to find a free chunk of the required size. Because the population grows continually during the run, memory stress can become progressively more severe as simulation time advances and memory limits are approached.

In the 8 GB test, a point of no return was reached at 51%, after a progressive and then pronounced slowdown. In the 12 GB test, the point of no return was reached at 97%, after a similar slowdown.

The total amount of memory used by a model run is similar for the Windows and Linux versions of a model. However, the symptoms of memory stress may differ depending on the OS and its configuration, particularly with respect to paging of memory to disk.

Based on these results, it looks likely that the memory requirements of a 900k run put an 8 GB machine into a memory-stressed state with a very big performance hit, even when the run does not crash (as in the Windows row of the table). There are technical reasons behind this behaviour (virtual memory paging to/from disk, memory fragmentation).

These results suggest that a machine with 16 GB or more of physical memory is needed to complete a 900k run without memory stress or a crash.

If memory requirements of the model increase due to code changes, a 900k run might require more than 16 GB.

The kind of information in the table (how much memory is needed for a given population size) is also useful for runs with multiple replicates (aka subs). You can gain parallelism and significant performance gains by using multiple threads, provided that each replicate has sufficient memory.

So a machine with (say) 32 GB of memory can safely run two 900k subs in parallel (one per thread), roughly halving execution time in a run with multiple replicates.

Runs with smaller populations can benefit from even higher parallelism, e.g. 4 threads giving roughly 4x run speed, provided the machine has sufficient CPU cores (which is usually the case for machines with a lot of physical memory).
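
A rough rule of thumb (just arithmetic, not an openM++ feature): the number of subs that can safely run in parallel on one machine is about

min(number of CPU cores, floor(available memory / memory required per sub))

The 32 GB example above is an instance of this rule, with roughly 16 GB required per 900k sub.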

OpenM++ runs can gain even higher parallelism by running in parallel on multiple machines, using MPI (Message Passing Interface) and a back-end grid of servers.

[back to topic contents]

Assisted calculation

Content to follow.

[back to topic contents]

Manual calculation

Under construction.

This subtopic works through a manual calculation of memory prediction options for the GMM model, using a series of probing runs. It contains the following sections.

Manual calculation - Run 1
Manual calculation - Run 2
Manual calculation - Run 3
Manual calculation - Run 4

[back to topic contents]

Manual calculation - Run 1

GMM Run #1 10k ompp_options.ompp:

//
// model resource options
// 

options resource_use = on; // collect and report resource use information at run-time.

// The following memory prediction options statements
//   for GMM were generated on 2023-04-01 18:03:36.340
//   using as popsize the parameter StartingPopulationSize = 10000
options memory_popsize_parameter = StartingPopulationSize;
options memory_MB_constant_per_instance = 0;        // was 1
options memory_MB_constant_per_sub = 4;        // was 5
options memory_MB_popsize_coefficient = 0.002142; // was 0.002000
options memory_adjustment_factor = 1.10;

GMM Run #1 10k Log:

2025-03-24 00:41:28.423    *****************************
2025-03-24 00:41:28.424    *  Resource Use Prediction  *
2025-03-24 00:41:28.425    *****************************
2025-03-24 00:41:28.426
2025-03-24 00:41:28.427      +--------------------------------+
2025-03-24 00:41:28.428      | Resource Use by Persistence    |
2025-03-24 00:41:28.429      +------------------------+-------+
2025-03-24 00:41:28.431      | Persistence            |    MB |
2025-03-24 00:41:28.433      +------------------------+-------+
2025-03-24 00:41:28.434      | Constant per instance  |     0 |
2025-03-24 00:41:28.435      | Constant per sub       |     6 |
2025-03-24 00:41:28.436      | Variable by popsize    |    21 |
2025-03-24 00:41:28.437      +------------------------+-------+
2025-03-24 00:41:28.438      | Total                  |    28 |
2025-03-24 00:41:28.440      +------------------------+-------+
2025-03-24 00:41:28.441
2025-03-24 00:41:28.442       // The following memory prediction options statements
2025-03-24 00:41:28.443       //   for GMM were generated on 2025-03-24 00:33:34.852
2025-03-24 00:41:28.444       //   using as popsize the parameter StartingPopulationSize = 10000
2025-03-24 00:41:28.445       options memory_popsize_parameter = StartingPopulationSize;
2025-03-24 00:41:28.446       options memory_MB_constant_per_instance = 0;        // was 0
2025-03-24 00:41:28.448       options memory_MB_constant_per_sub      = 6;        // was 4
2025-03-24 00:41:28.449       options memory_MB_popsize_coefficient   = 0.002169; // was 0.002142
2025-03-24 00:41:28.451       options memory_adjustment_factor        = 1.10;
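
Note that the generated options above appear to come directly from the Resource Use by Persistence table: 0 MB constant per instance, 6 MB constant per sub, and a popsize coefficient (0.002169) consistent with the roughly 21 MB of popsize-variable memory reported for a population of 10,000.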

[back to manual calculation]
[back to topic contents]

Manual calculation - Run 2

GMM Run #2 100k ompp_options.ompp:

//
// model resource options
// 

//options resource_use = on; // collect and report resource use information at run-time.

// The following memory prediction options statements
//   for GMM were generated on 2025-03-24 00:33:34.852
//   using as popsize the parameter StartingPopulationSize = 10000
options memory_popsize_parameter = StartingPopulationSize;
options memory_MB_constant_per_instance = 0;        // was 0
options memory_MB_constant_per_sub = 6;        // was 4
options memory_MB_popsize_coefficient = 0.002169; // was 0.002142
options memory_adjustment_factor = 1.10;

GMM Run #2 100k Log:

2025-03-24 00:55:43.392 member=0 Predicted memory required = 245 MB per parallel sub and 0 MB per instance
...
2025-03-24 02:20:05.173 Process peak memory usage: 637.01 MB
...

Clearly, for this run of GMM, the predicted memory use of 245 MB was very different from the actual peak memory use of 637 MB. A manual calculation is called for.
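
For reference, the 245 MB prediction is consistent with applying the Run #1 options linearly to the 100,000 population: 1.10 × (6 + 0.002169 × 100,000) ≈ 245 MB per sub.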

[back to manual calculation]
[back to topic contents]

Manual calculation - Run 3

GMM Run #3 110k Log:

2025-03-24 15:27:01.431 Process peak memory usage: 662.32 MB

Calculations:

Run #3 population size was 110,000 compared to 100,000 for Run #2. The peak memory use reported in the Process peak memory usage: line of the run logs was 637.01 MB for Run #2 and 662.32 MB for Run #3. So, an additional 10,000 in population size required an additional 25.31 MB of memory. Dividing (25.31 / 10,000) gives a marginal requirement of 0.002531 MB of memory per unit of population size.

That produces the manual calculation of the variable component:

options memory_MB_popsize_coefficient = 0.002531;

The constant portion of memory use can be computed from that coefficient, assuming linearity. The variable component of memory use for Run #2 is
0.002531 * 100000 = 253.10 MB.

The constant portion of memory use is the difference between peak memory use and variable memory use:

637.01 - 253.10 = 383.91 MB.

This is a mixture of per instance and per sub memory use.
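
The same arithmetic can be scripted. Here is a minimal sketch (plain Python, not part of openM++ or of the model) that reproduces the numbers above from the two probing runs:

# Memory prediction options from two probing runs (Run #2 and Run #3 above).
# Peak memory figures are the "Process peak memory usage" values from the run logs.
pop2, peak2 = 100_000, 637.01   # Run #2: population size, peak memory (MB)
pop3, peak3 = 110_000, 662.32   # Run #3: population size, peak memory (MB)

# Marginal memory per unit of population size, assuming linearity.
coefficient = (peak3 - peak2) / (pop3 - pop2)   # 0.002531

# Constant portion = peak memory minus the variable (popsize-driven) portion.
constant_MB = peak2 - coefficient * pop2        # 383.91

print(f"options memory_MB_popsize_coefficient = {coefficient:.6f};")
print(f"options memory_MB_constant_per_sub    = {round(constant_MB)};")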

The per instance and per sub portions of constant memory use could be distinguished using additional probing runs with and without multiple subs in an instance. In this example, that might be a Run #2b with two subs in a single instance, each of size 100,000.
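
As a sketch (assuming linearity, and that both subs are resident in memory at the peak, e.g. running in parallel on two threads): if Run #2 has peak P1 = per_instance + per_sub + variable, and Run #2b with two 100,000 subs in one instance has peak P2 = per_instance + 2 × (per_sub + variable), then P2 - P1 = per_sub + variable. So per_instance ≈ 2 × P1 - P2 and per_sub ≈ (P2 - P1) - variable, where variable = 0.002531 × 100,000 = 253.10 MB.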

In practice for most model designs, the per instance portion can be rolled into the per sub portion with only minor effects. Moreover, that is a conservative assumption for predicting maximum memory requirements.

As it turns out, each sub in a multi-sub GMM run needs to run in its own distinct process anyway, for technical reasons, so there is no practical distinction between per instance and per sub memory use for GMM.

So, the option settings for GMM constant memory are:

options memory_MB_constant_per_instance = 0;
options memory_MB_constant_per_sub = 384;

Putting it all together, here is the model code fragment containing all the option settings for predicting memory requirements for GMM runs like Run #2:

// The following memory prediction options statements
//   for GMM were estimated manually on 2025-03-24
//   using as popsize the parameter StartingPopulationSize.
//   A pair of runs were used in the estimation,
//   the first with a population size of 100000 and the second with 110000.

options memory_popsize_parameter = StartingPopulationSize;
options memory_MB_constant_per_instance = 0;
options memory_MB_constant_per_sub = 384;
options memory_MB_popsize_coefficient = 0.002531;
options memory_adjustment_factor = 1.10;

These option settings were tested in Run #4, with a starting population size of 250,000, which is a typical starting population size for a run of GMM.
An extract of the run log is in the next section.

[back to manual calculation]
[back to topic contents]

Manual calculation - Run 4

GMM Run #4 250k Log:

...
2025-03-27 10:31:13.071 member=0 Predicted memory required = 1118 MB per parallel sub and 0 MB per instance
...
2025-03-27 14:17:02.819 Process peak memory usage: 1005.64 MB
...

The predicted memory requirements worked well for a run with over 2x the population of the runs used to manually compute the option values for memory prediction.

The predicted memory requirements are deliberately higher than actual peak memory use because the option memory_adjustment_factor increased the estimated memory use by 10%.
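
As a check, the 1118 MB figure in the log is consistent with applying the options linearly to the 250,000 population: 1.10 × (384 + 0.002531 × 250,000) ≈ 1118 MB per sub, about 11% above the observed peak of 1005.64 MB.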

[back to manual calculation]
[back to topic contents]
