updated high level documentation #154

Open: wants to merge 3 commits into base: main
185 changes: 126 additions & 59 deletions docs/source/core_concepts.rst
@@ -1,90 +1,157 @@
=================================
Core Concepts
=================================

Rompy is a Python library for generating ocean model control files and required input
data ready for ingestion into the model. The framework is separated into two broad
concepts:


.. autosummary::
:nosignatures:
:toctree: _generated/

rompy.model.ModelRun
rompy.core.BaseConfig

There is information about each of these in the documentation of each object, but at a
high level, ModelRun is the framework that renders the config object and controls the
period over which the model is run, and the config object is responsible for producing
the model configuration.

Consider a very simple case using the `BaseConfig` class. This class is not intended to
do anything except provide a base class on which to implement a specific model;
however, it is functional and can be used to demonstrate core concepts.
.. -*- mode: rst -*-

=================
Core Concepts
=================

Core objects
------------
Rompy provides a framework for generating ocean model input files and managing simulation setup. It revolves around two primary concepts: defining *what* to run (Configuration) and *how/when* to run it (Runtime).

Grid
^^^^
Model Runtime (`ModelRun`)
--------------------------
The :py:class:`~rompy.model.ModelRun` class orchestrates the entire process for a specific simulation instance. It defines:

Grids form a core component of any model. Rompy provides a base class for grids, and a
regular grid class. Support for other grid types will be added in the future.
* **`run_id`**: A unique identifier for the simulation run.
* **`output_dir`**: The base directory where simulation files will be generated.
* **`period`**: A :py:class:`~rompy.core.time.TimeRange` object specifying the start time, duration/end time, and interval for the simulation.
* **`config`**: An instance of a model-specific configuration class (subclass of :py:class:`~rompy.core.config.BaseConfig`).

When executed (e.g., by calling `run()`), `ModelRun` combines its runtime parameters with the `config` object and uses a templating engine (`cookiecutter`) to generate the necessary model input files within a structured directory (`output_dir/run_id`).

.. autosummary::
:nosignatures:
:toctree: _generated/

rompy.core.grid.BaseGrid
rompy.core.grid.RegularGrid
rompy.model.ModelRun
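
As a rough illustration of how runtime parameters and a configuration combine into a staging directory, the sketch below uses only the standard library. `SketchRun`, its fields, and the `INPUT` file name are hypothetical stand-ins rather than rompy's API, and `string.Template` stands in for the `cookiecutter` rendering step.

```python
from dataclasses import dataclass
from datetime import datetime
from pathlib import Path
from string import Template
import tempfile


@dataclass
class SketchRun:
    """Hypothetical stand-in for ModelRun; not rompy's actual class."""

    run_id: str
    output_dir: Path
    start: datetime
    end: datetime
    config: dict  # stand-in for a model-specific config object

    @property
    def staging_dir(self) -> Path:
        # Files are generated under output_dir/run_id.
        return Path(self.output_dir) / self.run_id

    def generate(self, template: str) -> Path:
        # Merge runtime values and config values into one rendering context.
        context = {
            "run_id": self.run_id,
            "start": self.start.isoformat(),
            "end": self.end.isoformat(),
            **self.config,
        }
        self.staging_dir.mkdir(parents=True, exist_ok=True)
        target = self.staging_dir / "INPUT"  # assumed file name
        # string.Template stands in for the cookiecutter rendering step.
        target.write_text(Template(template).substitute(context))
        return target


run = SketchRun(
    run_id="test_run",
    output_dir=Path(tempfile.mkdtemp()),
    start=datetime(2023, 1, 1),
    end=datetime(2023, 1, 2),
    config={"friction": "0.02"},
)
rendered = run.generate("RUN $run_id FROM $start TO $end FRIC $friction\n")
print(rendered.read_text())
```

The point of the sketch is the separation of concerns: the run object owns the identifier, location, and period, while the configuration only contributes model settings to the rendering context.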

Model Configuration (`BaseConfig` and Subclasses)
-------------------------------------------------
The :py:class:`~rompy.core.config.BaseConfig` class, and its model-specific subclasses (e.g., :py:class:`~rompy.swan.config.SwanConfigComponents`, :py:class:`~rompy.schism.config.SCHISMConfig`, :py:class:`~rompy_xbeach.config.Config`), define the static aspects of a model setup using `Pydantic`. This includes:

Data
^^^^
* Model parameters (physics options, numerical schemes).
* Spatial grid definitions.
* Input forcing specifications.
* Output requirements.
* A reference to the `cookiecutter` template used for generating input files.

Data objects are used to represent data inputs into the model. Rompy provides the
following base classes for data objects:
These configuration objects ensure that settings are type-checked, validated, and can be easily created from Python dictionaries or loaded from YAML/JSON files, promoting a declarative approach to model setup. Model-specific configurations translate these structured settings into the syntax required by the target model (e.g., SWAN command components, SCHISM namelists, XBeach parameter files).

.. autosummary::
:nosignatures:
:toctree: _generated/

rompy.core.data.DataBlob
rompy.core.data.DataGrid
rompy.core.source.SourceBase
rompy.core.source.SourceFile
rompy.core.source.SourceIntake


Boundary
^^^^^^^^
rompy.core.config.BaseConfig
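
The declarative, validated style described above can be sketched without any third-party dependency. The example below uses standard-library dataclasses as a stand-in for `Pydantic` models; `SketchGrid`, `SketchConfig`, and all field names are hypothetical and do not reflect rompy's actual schema.

```python
import json
from dataclasses import dataclass, field


@dataclass
class SketchGrid:
    """Hypothetical grid settings; field names are assumptions."""

    x0: float
    y0: float
    dx: float
    dy: float
    nx: int
    ny: int

    def __post_init__(self):
        # Minimal validation; Pydantic would express such checks declaratively.
        if self.dx <= 0 or self.dy <= 0:
            raise ValueError("grid spacing must be positive")
        if self.nx < 1 or self.ny < 1:
            raise ValueError("grid dimensions must be at least 1")


@dataclass
class SketchConfig:
    """Hypothetical config holding static model settings."""

    grid: SketchGrid
    physics: dict = field(default_factory=dict)

    @classmethod
    def from_dict(cls, raw: dict) -> "SketchConfig":
        # Build nested objects first so their validation runs.
        data = dict(raw)
        data["grid"] = SketchGrid(**data["grid"])
        return cls(**data)


# The same settings could equally come from a YAML or JSON file on disk.
text = (
    '{"grid": {"x0": 115.0, "y0": -32.5, "dx": 0.01, "dy": 0.01,'
    ' "nx": 50, "ny": 40}, "physics": {"friction": "madsen"}}'
)
config = SketchConfig.from_dict(json.loads(text))
print(config.grid.nx, config.physics)
```

An invalid setting, such as a negative `dx`, fails when the configuration is built rather than when the model is run.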

.. autosummary::
:nosignatures:
:toctree: _generated/
Supporting Objects
------------------

rompy.core.boundary.BoundaryWaveStation
rompy.core.source.SourceWavespectra
Several core objects support the `ModelRun` and `Config` classes:

**Grid Definitions (`rompy.core.grid`, model-specific grid modules)**
Represent the spatial domain and discretization of the model.

Spectrum
^^^^^^^^
* **Core Grids:**
* :py:class:`~rompy.core.grid.BaseGrid`: Abstract base defining minimal interface (coordinate properties, `bbox()`, `boundary()` methods).
* :py:class:`~rompy.core.grid.RegularGrid`: Concrete implementation for standard rectangular grids (origin, rotation, spacing, dimensions).
* **Model-Specific Grids:**
* Plugins define their own grid types inheriting from `BaseGrid` or `RegularGrid` to add model-specific parameters or methods (e.g., :py:class:`~rompy.swan.grid.SwanGrid`, :py:class:`~rompy.schism.grid.SCHISMGrid`).
* The XBeach plugin :py:class:`~rompy_xbeach.grid.RegularGrid` extends the core `RegularGrid` with CRS handling and an `alfa` (rotation) parameter specific to XBeach conventions.

.. autosummary::
:nosignatures:
:toctree: _generated/

rompy.core.spectrum.LogFrequency


Model Run
---------------

rompy.core.grid.BaseGrid
rompy.core.grid.RegularGrid
rompy.swan.grid.SwanGrid
rompy.schism.grid.SCHISMGrid
rompy_xbeach.grid.RegularGrid
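
To make the regular-grid parameters concrete (origin, rotation, spacing, dimensions), here is a minimal sketch that derives a bounding box from them. `SketchRegularGrid` is a hypothetical class, and the rotation convention (degrees, counter-clockwise from the x axis) is an assumption, not necessarily the one rompy or any plugin uses.

```python
import math
from dataclasses import dataclass


@dataclass
class SketchRegularGrid:
    """Hypothetical regular grid; not rompy's RegularGrid."""

    x0: float
    y0: float
    rot: float  # degrees, counter-clockwise from the x axis (assumed)
    dx: float
    dy: float
    nx: int
    ny: int

    def point(self, i: int, j: int) -> tuple:
        """Return the (x, y) location of grid node (i, j)."""
        cos_r = math.cos(math.radians(self.rot))
        sin_r = math.sin(math.radians(self.rot))
        xi, yj = i * self.dx, j * self.dy
        return (
            self.x0 + xi * cos_r - yj * sin_r,
            self.y0 + xi * sin_r + yj * cos_r,
        )

    def corners(self) -> list:
        last_i, last_j = self.nx - 1, self.ny - 1
        return [
            self.point(0, 0),
            self.point(last_i, 0),
            self.point(last_i, last_j),
            self.point(0, last_j),
        ]

    def bbox(self, buffer: float = 0.0) -> tuple:
        """Return (xmin, ymin, xmax, ymax), optionally padded by buffer."""
        xs, ys = zip(*self.corners())
        return (min(xs) - buffer, min(ys) - buffer, max(xs) + buffer, max(ys) + buffer)


grid = SketchRegularGrid(x0=0.0, y0=0.0, rot=0.0, dx=1.0, dy=2.0, nx=5, ny=3)
print(grid.bbox())
```

A bounding box of this kind is what a data object needs in order to crop input data to the model domain.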

**Data Handling and Forcing (`rompy.core.source`, `rompy.core.data`, `rompy.core.boundary`, `rompy.core.filters`, model-specific data modules)**
Manages acquisition, processing, and formatting of model input data (e.g., bathymetry, wind, boundary conditions). This uses a layered approach:

* **Source Objects (The "Where"):** Define *where* the raw data comes from.
* :py:class:`~rompy.core.source.SourceBase`: Abstract base class.
* Core implementations handle origins like local files (:py:class:`~rompy.core.source.SourceFile`), intake catalogs (:py:class:`~rompy.core.source.SourceIntake`), Datamesh (:py:class:`~rompy.core.source.SourceDatamesh`), existing datasets (:py:class:`~rompy.core.source.SourceDataset`), spectral files (:py:class:`~rompy.core.source.SourceWavespectra`), CSV/DataFrames (:py:class:`~rompy.core.source.SourceTimeseriesCSV`, :py:class:`~rompy.core.source.SourceTimeseriesDataFrame`).
* Plugins can define additional sources tailored to specific model needs or data types (e.g., :py:class:`~rompy_xbeach.source.SourceGeotiff` for geospatial rasters, :py:class:`~rompy_xbeach.source.SourceCRSOceantide` for tidal constituents). These often add CRS awareness.
* The `open()` method returns an `xarray.Dataset`.

.. autosummary::
:nosignatures:
:toctree: _generated/

rompy.core.source.SourceBase
rompy.core.source.SourceFile
rompy.core.source.SourceIntake
rompy.core.source.SourceDatamesh
rompy.core.source.SourceDataset
rompy.core.source.SourceWavespectra
rompy.core.source.SourceTimeseriesCSV
rompy.core.source.SourceTimeseriesDataFrame
rompy_xbeach.source.SourceGeotiff
rompy_xbeach.source.SourceCRSOceantide
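
The value of the source abstraction is that downstream code only relies on `open()`. The sketch below illustrates that idea with hypothetical classes (`SketchSource`, `SketchSourceCSV`, `SketchSourceMemory`); a plain dict of lists stands in for the `xarray.Dataset` that the real sources return.

```python
import csv
import io
from abc import ABC, abstractmethod


class SketchSource(ABC):
    """Hypothetical source interface; a dict of lists stands in for a dataset."""

    @abstractmethod
    def open(self) -> dict:
        """Return the data as a mapping of variable name to list of values."""


class SketchSourceCSV(SketchSource):
    """Reads numeric columns from CSV text."""

    def __init__(self, text: str):
        self.text = text

    def open(self) -> dict:
        reader = csv.DictReader(io.StringIO(self.text))
        rows = list(reader)
        return {name: [float(row[name]) for row in rows] for name in reader.fieldnames}


class SketchSourceMemory(SketchSource):
    """Wraps data that is already in memory."""

    def __init__(self, data: dict):
        self.data = data

    def open(self) -> dict:
        return self.data


def mean_of(source: SketchSource, variable: str) -> float:
    # Works with any source because it only depends on open().
    values = source.open()[variable]
    return sum(values) / len(values)


csv_source = SketchSourceCSV("hs,tp\n1.0,8.0\n3.0,10.0\n")
memory_source = SketchSourceMemory({"hs": [2.0, 4.0]})
print(mean_of(csv_source, "hs"), mean_of(memory_source, "hs"))
```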

* **Data Objects (The "What" and "How"):** Define *what* data is needed and *how* it should be processed.
* :py:class:`~rompy.core.data.DataGrid`: Central class for gridded data. Holds a `Source` object and specifies the `variables`, the `coords` mapping, and the `filters` to apply. Crops the data automatically in space and time to the model `grid` and `period`, controlled by the `crop_data` flag and buffers. The `ds` property provides the processed `xarray.Dataset`.
* :py:class:`~rompy.core.boundary.DataBoundary`: Specializes `DataGrid` for boundary conditions. Adds `spacing` and `sel_method` for selecting points along the model boundary.
* :py:class:`~rompy.core.data.DataPoint`: Simplified version for timeseries/point data.
* :py:class:`~rompy.core.data.DataBlob`: Basic file/directory handler (copy or link).

.. autosummary::
:nosignatures:
:toctree: _generated/

rompy.core.data.DataGrid
rompy.core.boundary.DataBoundary
rompy.core.data.DataPoint
rompy.core.data.DataBlob
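
The automatic cropping described for gridded data can be illustrated with a small sketch. `SketchDataGrid` and its `cropped()` method are hypothetical; records are plain dicts and times are integers to keep the example self-contained, whereas the real classes operate on `xarray.Dataset` objects.

```python
from dataclasses import dataclass


@dataclass
class SketchDataGrid:
    """Hypothetical stand-in for a gridded data object; not rompy's DataGrid."""

    records: list  # each record: {"time": int, "x": float, "y": float, ...}
    variables: list
    crop_data: bool = True
    buffer: float = 0.0

    def cropped(self, bbox: tuple, period: tuple) -> list:
        """Keep requested variables, cropped to bbox and period if enabled."""
        xmin, ymin, xmax, ymax = bbox
        start, end = period
        keep = ("time", "x", "y", *self.variables)
        out = []
        for rec in self.records:
            inside = (
                xmin - self.buffer <= rec["x"] <= xmax + self.buffer
                and ymin - self.buffer <= rec["y"] <= ymax + self.buffer
                and start <= rec["time"] <= end
            )
            if inside or not self.crop_data:
                out.append({key: rec[key] for key in keep})
        return out


records = [
    {"time": 0, "x": 0.0, "y": 0.0, "u10": 3.0, "v10": 4.0},
    {"time": 1, "x": 1.5, "y": 0.5, "u10": 2.0, "v10": 1.0},
    {"time": 1, "x": 9.0, "y": 9.0, "u10": 0.0, "v10": 0.0},
    {"time": 5, "x": 0.5, "y": 0.5, "u10": 1.0, "v10": 1.0},
]
wind = SketchDataGrid(records=records, variables=["u10"], buffer=0.5)
subset = wind.cropped(bbox=(0.0, 0.0, 1.0, 1.0), period=(0, 2))
print(len(subset))
```

The buffer widens the spatial window so that data just outside the model domain remains available, for example for interpolation near the edges.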

* **Filter Object (Processing):**
* :py:class:`~rompy.core.filters.Filter`: Applies transformations like sorting, subsetting, cropping, renaming, and deriving variables to the dataset loaded by the `Source`. Automatically updated by `DataGrid` if `crop_data` is enabled.

.. autosummary::
:nosignatures:
:toctree: _generated/

rompy.core.filters.Filter
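
One way to picture a filter object is as an ordered mapping of named operations applied in sequence to the loaded data. The sketch below is a hypothetical illustration of that pattern; the operation names (`rename`, `derived`, `subset`) and their argument shapes are assumptions, not the signatures of rompy's `Filter`.

```python
import math


def f_rename(data: dict, mapping: dict) -> dict:
    # Rename variables, leaving unmapped names untouched.
    return {mapping.get(name, name): values for name, values in data.items()}


def f_derived(data: dict, derived: dict) -> dict:
    # Add new variables computed from existing ones.
    out = dict(data)
    for name, func in derived.items():
        out[name] = func(out)
    return out


def f_subset(data: dict, variables: list) -> dict:
    # Keep only the requested variables.
    return {name: data[name] for name in variables}


FILTERS = {"rename": f_rename, "derived": f_derived, "subset": f_subset}


def apply_filters(data: dict, filters: dict) -> dict:
    """Apply each named filter in the order given (dicts preserve order)."""
    for name, argument in filters.items():
        data = FILTERS[name](data, argument)
    return data


raw = {"u10": [3.0, 0.0], "v10": [4.0, 2.0]}
filters = {
    "rename": {"u10": "u", "v10": "v"},
    "derived": {"speed": lambda d: [math.hypot(a, b) for a, b in zip(d["u"], d["v"])]},
    "subset": ["speed"],
}
result = apply_filters(raw, filters)
print(result)
```

Because the filters are data rather than code paths, a cropping step can be added or updated programmatically before they are applied.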

* **Model-Specific Data Objects (Formatting):**
* While core `DataGrid` and `DataBoundary` handle sourcing and filtering data into a standardized `xarray.Dataset`, model-specific subclasses handle the final step: **formatting and writing** this data into the files required by the target model. They override the `get()` method to perform this task. Examples include:
* :py:class:`~rompy.swan.data.SwanDataGrid`: Writes processed data into SWAN ASCII grid files.
* :py:class:`~rompy.schism.data.SfluxAir`: Writes atmospheric data into SCHISM's sflux NetCDF format.
* :py:class:`~rompy_xbeach.data.XBeachBathy`: Handles geospatial rasters (via `SourceGeotiff`), interpolates using specified methods (like :py:class:`~rompy_xbeach.interpolate.RegularGridInterpolator`), potentially extends the grid seaward (e.g., using :py:class:`~rompy_xbeach.data.SeawardExtensionLinear`), and writes the final bathymetry into XBeach's specific format (xdata.txt, ydata.txt, bathy.txt).
* :py:class:`~rompy_xbeach.boundary.BoundaryStationSpectraJons`: Selects spectral data from stations, calculates JONSWAP parameters, and writes boundary conditions to XBeach JONS format files (either a single file or a filelist).
* :py:class:`~rompy_xbeach.forcing.WindGrid`: Selects gridded wind data (potentially calculating speed/direction from U/V components using :py:class:`~rompy_xbeach.forcing.WindVector` or :py:class:`~rompy_xbeach.forcing.WindScalar`) and writes it to the XBeach time-varying wind file format.

.. autosummary::
:nosignatures:
:toctree: _generated/

rompy.swan.data.SwanDataGrid
rompy.schism.data.SfluxAir
rompy.schism.data.SCHISMDataOcean
rompy.swan.boundary.Boundnest1
rompy_xbeach.data.XBeachBathy
rompy_xbeach.boundary.BoundaryStationSpectraJons
rompy_xbeach.forcing.WindGrid
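
The split between generic data preparation and model-specific writing can be sketched as a base class that exposes processed data and a subclass that overrides `get()` to write it. The classes and the simple space-separated text format below are hypothetical; they do not reproduce the SWAN, SCHISM, or XBeach file formats.

```python
from pathlib import Path
import tempfile


class SketchDataBase:
    """Hypothetical base: holds processed values, leaves writing to subclasses."""

    def __init__(self, values: list):
        self.values = values  # 2D list of floats, one inner list per grid row

    @property
    def ds(self) -> list:
        # In the real classes this is where sourcing and filtering would happen.
        return self.values

    def get(self, destdir: Path) -> Path:
        raise NotImplementedError("model-specific subclasses write the file")


class SketchAsciiBathy(SketchDataBase):
    """Hypothetical subclass writing a plain ASCII grid (assumed format)."""

    def get(self, destdir: Path) -> Path:
        target = Path(destdir) / "bathy.txt"
        lines = [" ".join(f"{value:.2f}" for value in row) for row in self.ds]
        target.write_text("\n".join(lines) + "\n")
        return target


staging = Path(tempfile.mkdtemp())
bathy = SketchAsciiBathy([[1.0, 2.0], [3.0, 4.5]])
written = bathy.get(staging)
print(written.read_text())
```

Returning the written path lets the caller embed it in the rendered control file.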

**Time Definition:**
Specifies simulation periods and intervals.
.. autosummary::
:nosignatures:
:toctree: _generated/

rompy.model.ModelRun
rompy.core.time.TimeRange
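
A simulation period defined by a start, an end or duration, and an interval can be sketched as follows. `SketchTimeRange` is a hypothetical class written for illustration; its constructor rules (exactly one of `end` or `duration`) are an assumption and may differ from those of `rompy.core.time.TimeRange`.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class SketchTimeRange:
    """Hypothetical period definition; not rompy's TimeRange."""

    start: datetime
    interval: timedelta
    end: Optional[datetime] = None
    duration: Optional[timedelta] = None

    def __post_init__(self):
        if (self.end is None) == (self.duration is None):
            raise ValueError("provide exactly one of end or duration")
        if self.interval <= timedelta(0):
            raise ValueError("interval must be positive")
        # Derive whichever of end/duration was not supplied.
        if self.end is None:
            self.end = self.start + self.duration
        else:
            self.duration = self.end - self.start

    def steps(self) -> list:
        """Return all time steps from start to end inclusive."""
        out, current = [], self.start
        while current <= self.end:
            out.append(current)
            current += self.interval
        return out


period = SketchTimeRange(
    start=datetime(2023, 1, 1),
    duration=timedelta(days=1),
    interval=timedelta(hours=6),
)
print(period.end, len(period.steps()))
```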

Workflow Summary
----------------

1. **Define Configuration:** Create a model-specific `Config` object (e.g., `SwanConfigComponents`, `SCHISMConfig`, `rompy_xbeach.config.Config`) defining the grid, physics, data requirements (using `DataGrid`, `DataBoundary`, etc., each containing a `Source`), outputs, etc.
2. **Define Runtime:** Create a `ModelRun` instance, specifying the `run_id`, `output_dir`, simulation `period`, and the `config`.
3. **Generate:** Call `model.generate()`. This triggers the `get()` method on each data object within the `config`, and then renders the `cookiecutter` template using `runtime` and `config` data, embedding paths to the generated input files. Each `get()` call:
* Optionally updates its internal `crop` filter based on the `ModelRun`'s `period` and the `Config`'s `grid`.
* Accesses its `ds` property, which loads data via the `Source` object and applies all defined `filters`.
* Writes the processed data to the model-specific format in the staging directory (`output_dir/run_id`).
4. **Model Plugins:** Model plugins (like `rompy-swan`, `rompy-schism`, `rompy-xbeach`) provide the specific `Config`, `Grid`, and `Data` subclasses needed for their respective models, fitting seamlessly into this core workflow.
5. **Execute (External):** Run the ocean model executable using the files in the staging directory.
6. **Analyze:** Analyze model output.
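
The ordering of the generate step (crop and write each data object, then render the control file with the resulting paths) can be sketched end to end. Everything here is hypothetical: `SketchData`, the `generate` function, the `control.txt` name, and the use of `str.format` in place of `cookiecutter`. Times are integers for brevity.

```python
from pathlib import Path
import tempfile


class SketchData:
    """Hypothetical data object with a get() that crops and writes."""

    def __init__(self, name: str, values: list):
        self.name = name
        self.values = values  # list of (time, value) pairs

    def get(self, staging: Path, period: tuple) -> Path:
        start, end = period
        # Crop to the run period, then write in a simple text format.
        cropped = [(t, v) for t, v in self.values if start <= t <= end]
        target = staging / f"{self.name}.txt"
        target.write_text("\n".join(f"{t} {v}" for t, v in cropped) + "\n")
        return target


def generate(run_id, output_dir, period, data_objects, template) -> Path:
    """Write every data file, then render the control file referencing them."""
    staging = Path(output_dir) / run_id
    staging.mkdir(parents=True, exist_ok=True)
    paths = {obj.name: obj.get(staging, period).name for obj in data_objects}
    control = staging / "control.txt"  # assumed file name
    control.write_text(template.format(run_id=run_id, **paths))
    return control


wind = SketchData("wind", [(0, 5.0), (1, 6.0), (2, 7.0)])
control = generate(
    run_id="demo",
    output_dir=tempfile.mkdtemp(),
    period=(1, 2),
    data_objects=[wind],
    template="RUN {run_id}\nWIND {wind}\n",
)
print(control.read_text())
```

The data files have to exist before rendering because the control file embeds their paths.
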
41 changes: 23 additions & 18 deletions docs/source/index.rst
@@ -1,47 +1,52 @@
.. -*- mode: rst -*-

=================================
Welcome to rompy's documentation!
=================================

*Taking the pain out of ocean model setup*

**This library is in early prototype stage and the interfaces are likely to change**

This library takes an opinionated approach to combining the functionality of the cookie-cutter library (https://github.com/cookiecutter/cookiecutter) with the XArray ecosystem (http://xarray.pydata.org/en/stable/related-projects.html) and intake (https://github.com/intake/intake) for data to aid in the configuration and evaluation of coastal scale numerical models.

There are two base classes, BaseModel and BaseGrid. BaseModel implements the cookie-cutter code and model configuration packaging. BaseGrid defines a loose definition of the grid as two arrays of x, y points that establish the model's geographic extents, bounding box and convex hull.
**Status:** This library is under active development. While functional, interfaces may evolve.

At present only one example model has been implemented - the SwanModel (http://swanmodel.sourceforge.net/). An example cookie-cutter template for swan is provided in the ```rompy/templates``` folder.
.. figure:: /_static/logo.svg
:align: center
:alt: rompy logo
:width: 400px

A model implementation will generally consist of the following components:
Introduction
------------

1. A model class that inherits from BaseModel and implements the minimal interface. At present only a private ```_get_grid()``` method.
2. A grid class that inherits from BaseGrid and implements the minimal interface of either loading the grid from file or a model specific grid specification string
3. An XArray accessor that has methods that translate an XArray dataset into a model specific input file format (usually some bespoke text file format). This allows convenient namespacing of methods from an XArray dataset e.g.:
Relocatable Ocean Modelling in PYthon (rompy) is a Python framework designed to streamline the configuration, execution, and analysis of coastal and ocean numerical models. It addresses the often complex and model-specific nature of setting up simulations by providing:

``` ds.swan.to_inpgrid(filename) ```
* **Structured Configuration:** Leverages `Pydantic` models for defining model settings (including spatial grids, physics, forcing sources, outputs) in a clear, type-safe, and validated manner. Configurations can be defined programmatically in Python or declaratively via YAML/JSON files.
* **Templated Setup:** Utilizes the `cookiecutter` engine to generate model-specific input files and directory structures based on the defined configuration and runtime parameters (e.g., simulation period).
* **Abstracted Data Handling:** Integrates with `xarray`, `intake`, `fsspec`, and `oceanum`'s Datamesh to provide flexible ways to source, filter, and process input forcing data (bathymetry, wind, boundary conditions, etc.) required by the models.
* **Workflow Orchestration:** The central `ModelRun` class manages the simulation lifecycle, combining runtime information (like the simulation period) with a model-specific `Config` object to generate a complete, ready-to-run model setup.
* **Extensibility:** Designed with a plugin architecture using Python's `entry_points`, allowing users and developers to easily add support for new models or data sources.
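
Plugin discovery through `entry_points` can be sketched with the standard library's `importlib.metadata`. The group name `rompy.config` used below is an assumption made for illustration; the group names a given rompy release actually registers are defined in its packaging metadata.

```python
from importlib.metadata import entry_points


def discover_plugins(group: str) -> dict:
    """Map entry point names to entry point objects for one group."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+ provides select() for filtering by group.
        selected = eps.select(group=group)
    else:
        # Python 3.8 and 3.9 return a dict keyed by group name.
        selected = eps.get(group, [])
    return {ep.name: ep for ep in selected}


# "rompy.config" is an assumed group name, used here only for illustration.
plugins = discover_plugins("rompy.config")
print(sorted(plugins))
```

Calling `load()` on a discovered entry point imports and returns the registered object, so an installed plugin becomes available without changes to the core package.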

The final main component of the library is an intake driver that builds on the intake-xarray.DataSourceMixin and allows for the stacking of multiple model forecast datasets that are typically published in netCDF format on THREDDS/OpenDAP servers. The unique features of the driver include:
`rompy` facilitates setup for models including:

1. The ability to use format strings in the urlpath and pass a dictionary of values for the format keys. The product of the dictionary values is expanded to a set of URLs that are checked for existence using the fsspec library. This allows for scanning of both local filesystems and http servers in a targeted fashion, for example a specific date range of interest.
2. The subset of URLs identified is opened with XArray with a preprocessing function that takes a dictionary of filters for common operations that are applied during pre-processing, allowing this to be parameterised in the intake catalog yaml entry for a specific dataset.
3. The result is either a stack of model forecasts normalised to an initialisation and lead time (hindcast=false), or a pseudo-reanalysis that selects the shortest lead time for each time point in the stack.
* **SWAN:** A detailed, component-based configuration mirroring SWAN's command structure (via the `rompy-swan` package).
* **SCHISM:** Support including both a minimal configuration and a comprehensive namelist-based approach (via the `rompy-schism` package).
* **XBeach:** Configuration and input generation (via the `rompy-xbeach <https://github.com/rom-py/rompy-xbeach>`_ plugin).

The goal is to provide a unified, Pythonic interface for diverse ocean models, promoting reproducibility, efficiency, and automation in modelling workflows.

.. toctree::
:hidden:
:maxdepth: 4
:maxdepth: 2

Home <self>
quickstart
core_concepts
models
demo
api
relational_diagrams
# relational_diagrams (Keep if relevant)

Indices and tables
==================

* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`