Screened Tables
Screened tables let model developers implement Statistical Disclosure Control policies at the cell level of entity tables. Screened tables can also be used to suppress statistically unreliable cells, or to round values to avoid impressions of spurious accuracy in model outputs.
- Introduction
- Overview
- Syntax and simple example
- Limitations
- Extrema collections
- Arguments of the transformation function
- Examples
content to follow.
Selected References:
The examples in this topic are based on the model `SM1` in the OpenM++ distribution in `OM_ROOT/models/SM1`.
`SM1` is a simple model which adds several attributes to the `NewCaseBased` model to provide raw material for the examples in this topic.
The example tables in this topic can be pasted into the module `SM1/code/ScreenedTables.ompp`,
and example screening code can be pasted into the body of the function `TransformScreened1` in `SM1/code/ScreeningCode.ompp`.
To activate one of the four screening methods for a given entity table,
include exactly one of the four keywords `screened1`, `screened2`, `screened3`, or `screened4` in the table's properties,
and supply a definition for the corresponding C++ transformation function.
For example, the following table is declared in `SM1/code/ScreenedTables.ompp`:
```
table snapshot screened1 Person EarningsAt50
[trigger_entrances(integer_age, 50)]
{
    {
        unit,            //EN Persons
        mean(earnings),  //EN Average earnings
        P50(earnings)    //EN Median earnings
    }
};
```
The table property `screened1` specifies that it will be screened using screening method #1,
which uses the C++ transformation function `TransformScreened1`.
That means that each of the three accumulators (statistics) in the single cell of `EarningsAt50`
will be subject to modification by the developer-supplied C++ function `TransformScreened1`.
This function is called automatically when the simulation of each sub/member/replicate completes,
for each value in this table.
The developer-supplied function `TransformScreened1` takes 8 arguments (described below),
whose values are supplied by the framework.
The first argument `in_value` is a value in the table before modification.
The function returns the possibly modified value.
Here's an example function body of `TransformScreened1` defined in the model code module `SM1/code/ScreeningCode.ompp`:
```cpp
double TransformScreened1(
    const double in_value,
    ...
    )
{
    /// transformed value, initialized to quiet NaN (shows as empty)
    double out_value = UNDEF_VALUE;

    // notional example of transformation (round to 00's)
    // scale value down by 100x
    out_value = in_value / 100.0;
    // round to nearest integer
    out_value = std::trunc(out_value + 0.5);
    // scale value up by 100x
    out_value *= 100.0;

    return out_value;
}
```
The C++ code in this example rounds all table values to the nearest 100. Note that this simplified code might not perform as desired for negative values.
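The caveat about negative values can be made concrete: with the truncation approach above, an input of -160 becomes -100 rather than -200, because `std::trunc` rounds toward zero. The following is a minimal sketch of one possible alternative, assuming symmetric rounding (halfway cases away from zero) is the desired behaviour. The helper name `round_to_base` is hypothetical and is not part of OpenM++ or SM1.

```cpp
#include <cmath>

// Hypothetical helper (not part of OpenM++): round a value to the nearest
// multiple of `base`, treating negative and positive values symmetrically.
double round_to_base(double value, double base)
{
    // std::round rounds halfway cases away from zero,
    // so 150 becomes 200 and -150 becomes -200 when base is 100.
    // NaN and +/-inf pass through unchanged.
    return std::round(value / base) * base;
}
```

Inside a transformation function, the three scaling and rounding statements could then be replaced by a single call such as `round_to_base(in_value, 100.0)`.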
An unscreened version of this table (without `screened1` in the table declaration) produces these results:
Value | Label |
---|---|
2538 | Persons |
99203.9 | Average earnings |
101191 | Median earnings |
whereas the screened version (with `screened1` in the table declaration as above) produces these results:
Value | Label |
---|---|
2500 | Persons |
99200 | Average earnings |
101200 | Median earnings |
To examine the transformation process in action,
build the Debug version of `SM1` and set breakpoints in the function `TransformScreened1` in the module `ScreeningCode.ompp`.
content to follow
The screening function arguments `smallest` and `largest` are collections containing, respectively, the lowest M and highest M observations in the table cell containing the value being screened,
where M is a configurable constant for each of the four screening methods.
These collections allow implementing 'dominance' rules for cell suppression, e.g. suppress a cell total if the top 3 observations in the cell account for more than 70% of the total.
To set an appropriate value for M, use the corresponding option `screened[1-4]_extremas_size` in model code.
For example, the following statement retains the lowest 3 and highest 3 observations in the `smallest` and `largest` extrema collections for method #1:
```
options screened1_extremas_size = 3;
```
Extrema collections might be smaller than the specified size if there are fewer than that many observations in the cell.
Extrema collections can contain the special floating point values +inf and -inf. They never contain the special floating point value NaN, because a NaN increment is treated as a run-time model error by OpenM++.
Reducing the size of extrema collections reduces memory and processing requirements.
M is set to 0 by default for all four screening methods, to avoid the computational and memory overhead of maintaining these collections for each cell of each screened table unless needed by the screening method.
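The dominance rule mentioned above (suppress a cell total if the top 3 observations account for more than 70% of it) could be sketched as follows. This is an illustration only: the helper names `is_dominated` and `suppress_if_dominated` are hypothetical, the sketch assumes the value being screened is a cell total (a `sum` statistic), that the extrema size option has been set to 3 so that `largest` holds the top 3 observations, and that the rule is applied only to positive totals.

```cpp
#include <limits>
#include <numeric>
#include <set>

// Hypothetical helper: true if the observations in `largest` account for
// more than `max_share` of the cell total.
bool is_dominated(double total, const std::multiset<double>& largest, double max_share)
{
    // Assumption: the rule is only applied to positive totals.
    if (!(total > 0.0)) {
        return false;
    }
    // Sum of the retained top observations (at most M of them).
    const double top = std::accumulate(largest.begin(), largest.end(), 0.0);
    return top > max_share * total;
}

// Hypothetical helper: returns quiet NaN (a suppressed cell) when the
// 70% dominance rule is violated, otherwise the unmodified value.
double suppress_if_dominated(double in_value, const std::multiset<double>& largest)
{
    if (is_dominated(in_value, largest, 0.70)) {
        return std::numeric_limits<double>::quiet_NaN();
    }
    return in_value;
}
```

In a transformation function, such a check would normally be guarded by a test on the `statistic` argument, since the rule is not meaningful for statistics such as a mean or a median.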
Subtopics:
The rows of the following table describe the 8 arguments of a screening function.
The example column contains values pasted from a debugger session on a breakpoint in the function `TransformScreened1` in `SM1`,
on the second invocation of the function.
Name | Example | Notes |
---|---|---|
in_value | 99203.855397951411 | The true value in the table, which can be transformed or suppressed by code in the transformation function. |
description | EarningsAt50: accumulator 1: mean(value_out(interval(earnings))) | A descriptive string which can be useful in debugging sessions. Don't attempt to parse it for content. |
statistic | mean (4) | The enumerator for the statistic for use in function code, e.g. `omr::stat::mean` |
increment | value_out (7) | The enumerator for the increment for use in function code, e.g. `omr::incr::value_out` |
observations | 2538.0000000000000 | The number of observations (increments) in the cell. It is always unweighted, even if the table is weighted. The value is less than the 5,000 cases in the SM1 Default run due to mortality before age 50. |
extrema_size | 3 | Identical to the value supplied in the `screened1_extremas_size` option for use in function code. This is the maximum possible size of the extrema collections. The actual size may be less if there are fewer observations in the cell. |
smallest | {0.0000000000000000, 0.0000000000000000, 0.0000000000000000} | The extrema collection is one of the standard C++ container types. See code examples elsewhere in this topic for use. The values are all zero in this example because the distribution of earnings in SM1 is mixed discrete-continuous, with a large subpopulation having zero earnings. |
largest | {370810.00000000000, 398272.00000000000, 484007.00000000000} | The three highest observed earnings. |
```cpp
/**
 * Table screening transformation function #1
 *
 * @param in_value     The table value subject to transformation.
 * @param description  A formatted string describing the table and statistic.
 * @param statistic    The statistic of the accumulator, e.g. sum, mean.
 * @param increment    The increment of the accumulator, e.g. delta, value_out.
 * @param observations The count of observations in the cell (# of increments).
 * @param extrema_size The maximum size M of the two extrema collections (configurable)
 * @param smallest     The extrema collection containing the smallest M observations.
 * @param largest      The extrema collection containing the largest M observations.
 *
 * @returns The transformed version of in_value.
 */
double TransformScreened1(
    const double in_value,
    const char* description,
    const omr::stat statistic,
    const omr::incr increment,
    const double observations,
    const size_t extrema_size,
    const std::multiset<double>& smallest,
    const std::multiset<double>& largest
);
```
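As an illustration of how the `observations` argument might be used, the following sketch suppresses a value when the cell contains too few observations to be considered reliable. The helper name `suppress_small_cells` and the threshold passed to it are hypothetical and chosen only for illustration. The sketch returns a quiet NaN for a suppressed value, which corresponds to the empty cell produced by `UNDEF_VALUE` in the earlier example.

```cpp
#include <limits>

// Hypothetical helper: suppress a table value when the cell has fewer than
// `min_observations` (unweighted) observations.
double suppress_small_cells(double in_value, double observations, double min_observations)
{
    if (observations < min_observations) {
        // Too few observations: return quiet NaN so the cell shows as empty.
        return std::numeric_limits<double>::quiet_NaN();
    }
    // Enough observations: leave the value unchanged.
    return in_value;
}
```

A transformation function body could then consist of a single statement such as `return suppress_small_cells(in_value, observations, 10.0);`, where the threshold 10 is an arbitrary example and not a recommended policy.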
```cpp
namespace omr {
    /// statistic in an entity table
    enum class stat {
        unit,
        sum,
        minimum,
        maximum,
        mean,
        variance,
        stdev,
        P1,
        P2,
        P5,
        P10,
        P20,
        P25,
        P30,
        P40,
        P50,
        P60,
        P70,
        P75,
        P80,
        P90,
        P95,
        P98,
        P99,
        gini,
    };
} // namespace omr
```
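The `statistic` argument makes it possible to treat different kinds of values differently. The sketch below rounds counts (`unit`) to a multiple of 5 and leaves every other statistic unchanged. The helper name `round_counts_only` and the rounding base are hypothetical. So that the sketch compiles on its own, it repeats the `omr::stat` enumeration from the listing above; in model code the enumeration is supplied by the framework and should not be redeclared.

```cpp
#include <cmath>

// Copied from the listing above so that this sketch is self-contained.
// In a model, this enumeration comes from the framework.
namespace omr {
    enum class stat {
        unit, sum, minimum, maximum, mean, variance, stdev,
        P1, P2, P5, P10, P20, P25, P30, P40, P50,
        P60, P70, P75, P80, P90, P95, P98, P99,
        gini,
    };
} // namespace omr

// Hypothetical helper: round counts to the nearest multiple of 5,
// and pass all other statistics through unmodified.
double round_counts_only(double in_value, omr::stat statistic)
{
    switch (statistic) {
    case omr::stat::unit:
        // A count of entities: round to base 5 (illustrative choice).
        return std::round(in_value / 5.0) * 5.0;
    default:
        // Any other statistic: no transformation in this sketch.
        return in_value;
    }
}
```

Because `stat` is a scoped enumeration, its enumerators are written with the full qualification, for example `omr::stat::unit`.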
```cpp
namespace omr {
    /// increment in an entity table
    enum class incr {
        unused,
        delta,
        delta2,
        nz_delta,
        value_in,
        value_in2,
        nz_value_in,
        value_out,
        value_out2,
        nz_value_out,
    };
} // namespace omr
```
content to follow.