Skip to content

Variables with hard-coded dimension labels are not very flexible #48

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
benbovy opened this issue Aug 31, 2018 · 4 comments
Closed

Variables with hard-coded dimension labels are not very flexible #48

benbovy opened this issue Aug 31, 2018 · 4 comments

Comments

@benbovy
Copy link
Member

benbovy commented Aug 31, 2018

Some processes are generic enough to be re-used in contexts where the (space) domain of a model even differs by its number of dimensions. Take for example the following process class:

@xs.process
class SurfaceTopography(object):
    """Update a surface topography as resulting from
    multiple processes either rising or lowering elevation.
    """
    elevation = xs.variable(dims=('y', 'x'), intent='inout',
                            description='surface topography elevation')

    elevation_up_vars = xs.group('elevation_up')
    elevation_down_vars = xs.group('elevation_down')

    def run_step(self, *args):
        elevation_up = sum((v for v in self.elevation_up_vars))
        elevation_down = sum((v for v in self.elevation_down_vars))
        self.elevation_change = elevation_up - elevation_down

    def finalize_step(self):
        self.elevation += self.elevation_change

It would be nice to just re-use this class for driving the evolution of either a 1-d profile (e.g., river channel), a 2-d regular grid (raster DEM) or an irregular mesh. Unfortunately, the class above doesn't allow that flexibility because of the hard-coded dims=('y', 'x') for the elevation variable.

There are different options to overcome this limitation:

1. Use class inheritance

Create an inherited process class for each case and redefine dims for the relevant variables , e.g.,

@xs.process
class ProfileTopography(SurfaceTopography):
    """Update a profile topography as resulting from
    multiple processes either rising or lowering elevation.
    """
    elevation = xs.variable(dims='x', intent='inout',
                            description='profile topography elevation')

This has the advantage to be explicit. We can adapt the class docstrings and variable description depending on the context. A downside is that we need to do create a new process class for every case and every generic process. We might end up with a lot of classes. Also, in the example above we only need to redefine one variable but there might be processes where many variables would need to be redefined. We might end up with a lot of code duplication.

2. Set dims so that it allows all possible dimension labels

For example:

@xs.process
class SurfaceTopography(object):
    """Update a surface topography as resulting from
    multiple processes either rising or lowering elevation.
    """
    elevation = xs.variable(dims=['x', ('y', 'x')], ...)

    ...

This is straightforward (nothing specific to implement), but also confusing and error-prone. It doesn't tell which of these dimension(s) labels are actually used in the model where the process is included.

3. Set dims so that dimension labels are defined at model creation using a mapping

For example:

SPACE_DIMS = xs.DimMapping('space',
                           default=('y', 'x'),
                           choices=['x', 'nodes', ('y', 'x')])

@xs.process
class SurfaceTopography(object):
    """Update a surface topography as resulting from
    multiple processes either rising or lowering elevation.
    """
    elevation = xs.variable(dims=SPACE_DIMS, ...)

    ...

when (an instance of) the SurfaceTopography process is attached to a model, xs.DimMapping('space') will try to get and return the actual dimension labels from the 'space' key of a mapping that is somehow stuck to a Model instance. This could be done, e.g., by adding an argument to Model(), e.g.,

model_grid2d = Model({'topo': SurfaceTopography}, map_dims={SPACE_DIMS: ('y', 'x')})
model_profile1d = Model({'topo': SurfaceTopography}, map_dims={SPACE_DIMS: 'x'})
model_mesh = Model({'topo': SurfaceTopography}, map_dims={SPACE_DIMS: 'nodes'})

map_dims is optional (empty mapping by default). If 'space' is not found in mapping keys, the default value set for the xs.DimMapping object is returned. If no default value is set, then a KeyError is raised.

It might make sense to somehow stick such a mapping to a specific process class, e.g., the one that defines the grid, so that when this class is included in a model, the latter automatically use it.

This option has several advantages:

  • it scales well and it is less error-prone (just define the labels once and use them everywhere), although obviously we don't really need DimMapping for that
  • it lets the user choosing arbitrary/alternative labels when creating a model
  • for process classes that use this feature but are maintained in different code bases, it would allow something like this:
model_grid2d = Model({...}, map_dims={SPACE_DIMS: ('y', 'x'), OTHER_DIMS: ('y', 'x')})

possible issues:

  • How to propagate the mapping when building a new model from another, e.g., with Model.update_processes?
@benbovy
Copy link
Member Author

benbovy commented Aug 31, 2018

Alternative names for DimMapping:

  • ModelDim
  • SurrogateDim

@benbovy
Copy link
Member Author

benbovy commented Nov 7, 2018

Another option:

4. A class factory that replaces one or more dimension labels declared in a process class.

Such a function would take a process class + a mapping old->new dimension labels and would return a new process class with replaced labels (looking at all declared variables).

model_grid2d = Model({'topo': SurfaceTopography})

model_profile1d = Model({
    'topo': xs.replace_dims(SurfaceTopography, {('y', 'x'): 'x'})
})

model_mesh = Model({
    'topo': xs.replace_dims(SurfaceTopography, {('y', 'x'): 'nodes'})
})

This is very simple and elegant from the API point of view. To avoid broken process dependencies , the returned class may just be a subclass of the input class (if #45 is merged). This option thus looks like option 1 but with less code to write.

@benbovy
Copy link
Member Author

benbovy commented Dec 10, 2019

With option 4, xs.replace_dims could also accept a Model object (and return another Model object), for replacing dimensions throughout all the processes in a model:

model_profile1d = xs.replace_dims(model_grid2d, {('y', 'x'): 'x'})

@benbovy
Copy link
Member Author

benbovy commented Dec 13, 2019

Closing this as I'm not very satisfied by any of the options above. #44 is a better approach IMO, that lefts users do what they want (except that it wouldn't work well with auto-generated docstrings #3).

If anyone else reads this and wants to share thoughts, feel free to comment, I can reopen it later.

@benbovy benbovy closed this as completed Dec 13, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant