Screened Tables


Screened tables let model developers implement Statistical Disclosure Control policies at the cell level of entity tables. Screened tables can also be used to suppress statistically unreliable cells, or to round values to avoid impressions of spurious accuracy in model outputs.

Topic contents

Introduction
Overview
Syntax and simple example
Limitations
Extrema collections
Arguments: arguments of the transformation function
Examples

Introduction

content to follow.

Selected References:

[back to topic contents]

Overview

The examples in this topic are based on the model SM1 in the OpenM++ distribution in OM_ROOT/models/SM1. SM1 is a simple model which adds several attributes to the NewCaseBased model to provide raw material for the examples in this topic.

The example tables in this topic can be pasted into the module SM1/code/ScreenedTables.ompp, and example screening code can be pasted into the body of the function TransformScreened1 in SM1/code/ScreeningCode.ompp.

[back to topic contents]

Syntax and simple example

To activate one of the four screening methods for a given entity table, include exactly one of the four keywords screened1, screened2, screened3, or screened4 in the table's properties, and supply a definition for the corresponding C++ transformation function. For example, the following table is declared in SM1/code/ScreenedTables.ompp:

table snapshot screened1 Person EarningsAt50
[trigger_entrances(integer_age, 50)]
{
    {
        unit,           //EN Persons
        mean(earnings), //EN Average earnings
        P50(earnings)   //EN Median earnings
    }
};

The table property screened1 specifies that it will be screened using screening method #1, which uses the C++ transformation function TransformScreened1.

That means that each of the three accumulators (statistics) in the single cell of EarningsAt50 will be subject to modification by the developer-supplied C++ function TransformScreened1. This function is called automatically when the simulation of each sub/member/replicate completes, for each value in this table.

The developer-supplied function TransformScreened1 takes 8 arguments (described below), whose values are supplied by the framework. The first argument in_value is a value in the table before modification. The function returns the possibly modified value. Here's an example function body of TransformScreened1 defined in the model code module SM1/code/ScreeningCode.ompp:

double TransformScreened1(
    const double in_value,
...
)
{
    /// transformed value, initialized to quiet NaN (shows as empty)
    double out_value = UNDEF_VALUE;

    // notional example of transformation (round to 00's)
    
    // scale value down by 100x
    out_value = in_value / 100.0;
    // round to nearest integer
    out_value = std::trunc(out_value + 0.5);
    // scale value up by 100x
    out_value *= 100.0;

    return out_value;
}

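The C++ code in this example rounds all table values to the nearest 100. Note that this simplified code does not round negative values correctly: std::trunc truncates toward zero after the 0.5 shift, so a value of -260 would become -200 rather than -300.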
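One way to make the rounding symmetric for negative values is to use std::round, which rounds halfway cases away from zero. The following is a minimal sketch, not code from SM1; the helper name round_to_hundreds is hypothetical and introduced only for illustration.

```cpp
#include <cmath>

// Hypothetical helper, not part of OpenM++ or SM1.
// Rounds a value to the nearest 100 for both positive and negative inputs.
double round_to_hundreds(const double in_value)
{
    // std::round rounds to the nearest integer, with halfway cases
    // rounded away from zero (2.5 -> 3, -2.5 -> -3).
    return std::round(in_value / 100.0) * 100.0;
}
```

Under this sketch, the body of TransformScreened1 could reduce to a single statement such as `return round_to_hundreds(in_value);`.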

An unscreened version of this table (without screened1 in the table declaration) produces these results:

Value	Label
2538	Persons
99203.9	Average earnings
101191	Median earnings

whereas the screened version (with screened1 in the table declaration as above) produces these results:

Value	Label
2500	Persons
99200	Average earnings
101200	Median earnings

To examine the transformation process in action, build the Debug version of SM1 and set breakpoints in the function TransformScreened1 in the module ScreeningCode.ompp.

[back to topic contents]

Limitations

content to follow

[back to topic contents]

Extrema collections

The screening function arguments smallest and largest are collections containing, respectively, the lowest M and highest M observations in the table cell containing the value being screened, where M is a configurable constant for each of the four screening methods. These collections allow implementing 'dominance' rules for cell suppression, e.g. suppress a cell total if the top 3 observations in the cell account for more than 70% of the total. To set an appropriate value for M, use the corresponding option screened[1-4]_extremas_size in model code. For example, the following statement retains the lowest 3 observations in the smallest extrema collection and the highest 3 observations in the largest extrema collection for method #1:

options screened1_extremas_size = 3;

Extrema collections might be smaller than the specified size if there are fewer than that many observations in the cell.
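The dominance rule mentioned above (suppress a total if the top observations exceed a given share of it) could be sketched as follows. This is an illustration under stated assumptions, not the OpenM++ implementation: the helper name suppress_if_dominated and its max_share parameter are hypothetical, the cell total is assumed to be the in_value of a sum statistic, and suppression is represented by quiet NaN. Because the collection may hold fewer than M values, the code sums whatever is present rather than assuming a fixed size.

```cpp
#include <cmath>
#include <limits>
#include <numeric>
#include <set>

// Hypothetical helper, not part of OpenM++.
// Returns quiet NaN (suppressed) if the observations held in `largest`
// account for more than `max_share` of `cell_total`; otherwise returns
// `cell_total` unchanged.
double suppress_if_dominated(
    const double cell_total,
    const std::multiset<double>& largest,
    const double max_share)
{
    // A share of the total is only meaningful for a finite, positive total.
    // Leaving other totals unchanged is a choice made for this sketch.
    if (!std::isfinite(cell_total) || cell_total <= 0.0) {
        return cell_total;
    }

    // Sum of the retained top observations (at most M of them).
    const double top_sum = std::accumulate(largest.begin(), largest.end(), 0.0);

    if (top_sum > max_share * cell_total) {
        return std::numeric_limits<double>::quiet_NaN();
    }
    return cell_total;
}
```

With screened1_extremas_size set to 3, a call such as `suppress_if_dominated(in_value, largest, 0.7)` would correspond to the "top 3 observations, 70%" rule.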

Extrema collections can contain the special floating point values +inf and -inf. They never contain the special floating point value NaN, because a NaN increment is treated as a run-time model error by OpenM++.

Reducing the size of extrema collections reduces memory and processing requirements.

M is set to 0 by default for all four screening methods, to avoid the computational and memory overhead of maintaining these collections for each cell of each screened table unless needed by the screening method.

[back to topic contents]

Arguments

Subtopics:

Screening function header
statistic enumeration
increment enumeration

The rows of the following table describe the 8 arguments of a screening function. The example column contains values pasted from a debugger session on a breakpoint in the function TransformScreened1 in SM1, on the second invocation of the function.

Name	Example	Notes
`in_value`	`99203.855397951411`	The true value in the table, which can be transformed or suppressed by code in the transformation function.
`description`	`EarningsAt50: accumulator 1: mean(value_out(interval(earnings)))`	A descriptive string which can be useful in debugging sessions. Don't attempt to parse it for content.
`statistic`	`mean (4)`	The enumerator for the statistic for use in function code, e.g. `omr::stat::mean`
`increment`	`value_out (7)`	The enumerator for the increment for use in function code, e.g. `omr::incr::value_out`
`observations`	`2538.0000000000000`	The number of observations (increments) in the cell. It is always unweighted, even if the table is weighted. The value is less than the 5,000 cases in the `SM1` Default run due to mortality before age 50.
`extrema_size`	`3`	Identical to the value supplied in the `screened1_extremas_size` option for use in function code. This is the maximum possible size of the extrema collections. The actual size may be less if there are fewer observations in the cell.
`smallest`	`{0.0000000000000000, 0.0000000000000000, 0.0000000000000000}`	The extrema collection is one of the standard C++ container types. See code examples elsewhere in this topic for use. The values are all zero in this example because the distribution of earnings in `SM1` is mixed discrete-continuous, with a large subpopulation having zero earnings.
`largest`	`{370810.00000000000, 398272.00000000000, 484007.00000000000}`	The three highest observed earnings.

Screening function header

/**
 * Table screening transformation function #1
 *
 * @param   in_value     The table value subject to transformation.
 * @param   description  A formatted string describing the table and statistic.
 * @param   statistic    The statistic of the accumulator, e.g. sum, mean.
 * @param   increment    The increment of the accumulator, e.g. delta, value_out.
 * @param   observations The count of observations in the cell (# of increments).
 * @param   extrema_size The maximum size M of the two extrema collections (configurable)
 * @param   smallest     The extrema collection containing the smallest M observations.
 * @param   largest      The extrema collection containing the largest M observations.
 *
 * @returns The transformed version of in_value.
 */
double TransformScreened1(
    const double in_value,
    const char* description,
    const omr::stat statistic,
    const omr::incr increment,
    const double observations,
    const size_t extrema_size,
    const std::multiset<double>& smallest,
    const std::multiset<double>& largest
);
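As an illustration of how the observations argument might be used, the following sketch applies a minimum cell count rule: a value is suppressed when it rests on too few observations. The helper name suppress_small_cell and the min_observations parameter are hypothetical, and quiet NaN is assumed to represent a suppressed value.

```cpp
#include <limits>

// Hypothetical helper, not part of OpenM++.
// Returns quiet NaN (suppressed) if the cell has fewer than
// `min_observations` observations; otherwise returns `in_value` unchanged.
double suppress_small_cell(
    const double in_value,
    const double observations,
    const double min_observations)
{
    if (observations < min_observations) {
        return std::numeric_limits<double>::quiet_NaN();
    }
    return in_value;
}
```

Inside a screening function, the call might look like `return suppress_small_cell(in_value, observations, 10.0);`, where the threshold of 10 is an arbitrary value chosen for this example.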

[back to Arguments]
[back to topic contents]

statistic enumeration

namespace omr {
    /// statistic in an entity table
    enum class stat {
        unit,
        sum,
        minimum,
        maximum,
        mean,
        variance,
        stdev,
        P1,
        P2,
        P5,
        P10,
        P20,
        P25,
        P30,
        P40,
        P50,
        P60,
        P70,
        P75,
        P80,
        P90,
        P95,
        P98,
        P99,
        gini,
    };
} // namespace omr
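A screening function can branch on the statistic to apply a different treatment to each kind of value. The sketch below is only an illustration: the policy it encodes (round counts to a multiple of 5, suppress minimum and maximum, pass everything else through) is invented for the example, the helper name screen_by_statistic is hypothetical, and the namespace demo holds an abbreviated stand-in for omr::stat so that the snippet compiles on its own. Model code would use the real omr::stat enumeration instead.

```cpp
#include <cmath>
#include <limits>

// Abbreviated stand-in for omr::stat, used only so this sketch is
// self-contained. It is not the OpenM++ definition.
namespace demo {
    enum class stat { unit, sum, minimum, maximum, mean };
}

// Hypothetical helper, not part of OpenM++.
double screen_by_statistic(const double in_value, const demo::stat statistic)
{
    switch (statistic) {
    case demo::stat::unit:
        // Counts: round to the nearest multiple of 5.
        return std::round(in_value / 5.0) * 5.0;
    case demo::stat::minimum:
    case demo::stat::maximum:
        // Each of these is the value of a single observation: suppress.
        return std::numeric_limits<double>::quiet_NaN();
    default:
        // All other statistics pass through unchanged in this sketch.
        return in_value;
    }
}
```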

[back to Arguments]
[back to topic contents]

increment enumeration

namespace omr {
    /// increment in an entity table
    enum class incr {
        unused,
        delta,
        delta2,
        nz_delta,
        value_in,
        value_in2,
        nz_value_in,
        value_out,
        value_out2,
        nz_value_out,
    };
} // namespace omr

[back to Arguments]
[back to topic contents]

Examples

content to follow.

[back to topic contents]
