Implied dimensions of RVs are not broadcasted by dims or observed #5993


Closed
aseyboldt opened this issue Jul 21, 2022 · 13 comments · Fixed by #6063
@aseyboldt
Member

The dims argument is not respected when a random variable is created with parameters that broadcast to the desired output dimensions:

coords = {
    "dim1": pd.RangeIndex(10),
    "dim2": pd.RangeIndex(7),
}

with pm.Model(coords=coords) as model:
    mu = np.zeros((10, 1))
    x = pm.Normal("x", mu=mu, dims=("dim1", "dim2"))
    y = pm.Normal("y", mu=mu, dims=("dim1", "dim2"), shape=(10, 7))

x.eval().shape
# Returns (10, 1), when it should be (10, 7)

y.eval().shape
# Correctly returns (10, 7)
  • PyMC/PyMC3 Version: latest release 4.1.3
  • Aesara/Theano Version:
  • Python Version:
  • Operating system:
  • How did you install PyMC/PyMC3: conda
@bherwerth
Contributor

Looks like shape is not checked against dims when the random variable is created. However, when sampling, this results in an error in xarray when the inference data is created:

with pm.Model(coords=coords) as model:
    mu = np.zeros((10, 1))
    # mu = np.zeros((10, 7))  # this works
    x = pm.Normal("x", mu=mu, dims=("dim1", "dim2"))
    samples = pm.sample_prior_predictive()

fails with

ValueError: conflicting sizes for dimension 'dim2': length 1 on the data but length 7 on coordinate 'dim2'
Traceback
Traceback (most recent call last):
  File "/Users/benedikt/code/pymc/nb_examples/issue_5993.py", line 15, in <module>
    samples = pm.sample_prior_predictive()
  File "/Users/benedikt/code/pymc/pymc/sampling.py", line 2267, in sample_prior_predictive
    return pm.to_inference_data(prior=prior, **ikwargs)
  File "/Users/benedikt/code/pymc/pymc/backends/arviz.py", line 591, in to_inference_data
    ).to_inference_data()
  File "/Users/benedikt/code/pymc/pymc/backends/arviz.py", line 522, in to_inference_data
    **self.priors_to_xarray(),
  File "/Users/benedikt/code/pymc/pymc/backends/arviz.py", line 471, in priors_to_xarray
    else dict_to_dataset(
  File "/Users/benedikt/miniforge3/envs/pymc-dev/lib/python3.10/site-packages/arviz/data/base.py", line 307, in dict_to_dataset
    data_vars[key] = numpy_to_data_array(
  File "/Users/benedikt/miniforge3/envs/pymc-dev/lib/python3.10/site-packages/arviz/data/base.py", line 254, in numpy_to_data_array
    return xr.DataArray(ary, coords=coords, dims=dims)
  File "/Users/benedikt/miniforge3/envs/pymc-dev/lib/python3.10/site-packages/xarray/core/dataarray.py", line 402, in __init__
    coords, dims = _infer_coords_and_dims(data.shape, coords, dims)
  File "/Users/benedikt/miniforge3/envs/pymc-dev/lib/python3.10/site-packages/xarray/core/dataarray.py", line 152, in _infer_coords_and_dims
    raise ValueError(
ValueError: conflicting sizes for dimension 'dim2': length 1 on the data but length 7 on coordinate 'dim2'

I can also create a random variable with a shape that's inconsistent with the shape of the dims and will receive a similar error when sampling:

x = pm.Normal("x", dims=("dim1", "dim2"), shape=(2, 2))

How about resolving this by using the following logic if dims is provided?

  • If shape is not given, set shape to the value inferred from dims.
  • If shape is given, assert that shape is consistent with dims.

Perhaps this could be implemented in Distribution.__new__?
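The two rules above can be sketched in plain Python (a hypothetical helper, not PyMC internals; `dim_lengths` stands in for `model.dim_lengths`):

```python
def resolve_shape(dim_lengths, dims, shape=None):
    """Resolve an RV's shape from dims, per the proposal above.

    dim_lengths: mapping of dim name -> length (stand-in for model.dim_lengths).
    """
    implied = tuple(dim_lengths[d] for d in dims)
    if shape is None:
        # shape not given: infer it from dims.
        return implied
    # shape given: assert it is consistent with dims.
    if tuple(shape) != implied:
        raise ValueError(
            f"shape {tuple(shape)} conflicts with dims {dims} (expected {implied})"
        )
    return tuple(shape)

resolve_shape({"dim1": 10, "dim2": 7}, ("dim1", "dim2"))  # -> (10, 7)
```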

@twiecki
Member

twiecki commented Jul 27, 2022

@bherwerth I like your idea for the fix.

@bherwerth
Contributor

I had a closer look at the code and I see that there is already logic in place for resizing RVs based on dims (and observed):

rv_out, dims, observed, resize_shape = _make_rv_and_resize_shape(
    cls=cls, dims=dims, model=model, observed=observed, args=args, **kwargs
)
if resize_shape:
    # A batch size was specified through `dims`, or implied by `observed`.
    rv_out = change_rv_size(rv=rv_out, new_size=resize_shape, expand=True)

However, it seems dims only triggers a resize in case the number of dimensions change:

ndim_resize = len(sdims) - ndim_implied

To address this issue, we would need a resize also when shape changes and ndim remains the same. Given the existing mechanism of resizes, I wonder what's best:

  1. Modify the existing logic after Distribution.dist is called and edit it so that it also works for shape changes with constant ndim.
  2. Or: Set shape based on dims before the RV is created in Distribution.dist in
    rv_out = cls.dist(*args, **kwargs)

When reading the code in distribution.Distribution, I was wondering why the RV is first created with Distribution.dist and then resized later based on dims (or observed). Why is it done in a second separate step and not together with the resize in Distribution.dist? I am referring to this resize:

rv_out = change_rv_size(rv=rv_out, new_size=replicate_shape, expand=True)
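As a NumPy illustration (not PyMC's implementation) of what such an expanding resize does: the new size is prepended on the left, so dimensions already implied by the parameters are never re-broadcast, which is exactly why a (10, 1) implied shape cannot become (10, 7) this way:

```python
import numpy as np

implied_draw = np.zeros((10, 1))  # shape implied by the parameters
# expand=True style resize: prepend the new size on the left.
expanded = np.broadcast_to(implied_draw, (4,) + implied_draw.shape)
print(expanded.shape)  # (4, 10, 1) -- the trailing (10, 1) is untouched
```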

@twiecki
Member

twiecki commented Jul 30, 2022

CC @michaelosthege

@michaelosthege
Member

This is not an easy problem.

The test case should also cover the case where the dims are initialized from MutableData:

with pm.Model() as pmodel:
    dat = pm.MutableData(
        "dat",
        np.ones((10, 7)),
        dims=("dim1", "dim2")
    )
    dlength_1 = pmodel.dim_lengths["dim1"]
    dlength_2 = pmodel.dim_lengths["dim2"]
    mu = np.zeros((10, 1))
    x = pm.Normal("x", mu=mu, dims=("dim1", "dim2"))
    y = pm.Normal("y", mu=mu, shape=(dlength_1, dlength_2))
print(x.eval().shape)
print(y.eval().shape)

@bherwerth
Contributor

This indeed looks like a more involved task. The current resize logic extends dimensions to the left, and I don't see a quick way to adapt it to cover broadcasting from dims. There are also edge cases to consider, such as specifying dims that are not in model.dim_lengths.

Would you agree in principle that it would make sense to determine shape from dims or observed before the call to Distribution.dist and to do all resizing in Distribution.dist? Could you explain why there is an additional resize after Distribution.dist in the current implementation? This looks a bit convoluted to me and perhaps the logic could be restructured by resizing only in one place.

@michaelosthege
Member

At this point the main reason for the additional resize is the Ellipsis feature by which the dimensionality of a variable can be extended.
It's cool, but I never actually used it in practice. Maybe we should deprecate it again? Then one would have to specify all shape elements or dims elements every time. For shape that's annoying, but for dims that's probably the default use case anyhow.

There might be additional reasons for the resize related to observed.

@ricardoV94
Member

ricardoV94 commented Aug 10, 2022

I think it's reasonable to take the shape from dims (if shape isn't provided). Happy to remove ellipsis as well.

I don't see why observed would conflict with any of this.

@ricardoV94
Member

I think something similar happens with observed as brought up (but a bit unrelated to the original issue in #5992)

@bherwerth
Contributor

Thanks for explaining and clarifying the reason for the additional resize.

I think it's reasonable to take the shape from dims (if shape isn't provided). Happy to remove ellipsis as well.

I understood @michaelosthege's suggestion to drop the ellipsis as applying only if we wanted to condense the two resize steps into a single one. I raised the question of why there are two resize steps when reading the code in Distribution. However, this is not directly related to the original problem, so maybe it's better to address it, if at all, separately from this issue.

Coming back to the original suggestion of inferring shape from dims, perhaps the following would be a minimal change in Distribution.__new__ that solves the issue while not interfering with the ellipsis feature:

        if dims is not None and "shape" not in kwargs:
            dims = convert_dims(dims)
            if len(dims) > 0 and dims[-1] != Ellipsis and all([d in model.dim_lengths for d in dims]):
                kwargs["shape"] = tuple(model.dim_lengths[d] for d in dims)

That is, we would only infer shape from dims if:

  • An Ellipsis is not used (in that case the user's intent would anyway be to 'extend' implied dimensions to the left rather than to broadcast),
  • And specified dimensions are available in model.dim_lengths.

If you agree with this proposal, I can add a test case and create a PR.

@ricardoV94
Member

ricardoV94 commented Aug 11, 2022

Just to give the big picture, the problem is that we wrongly assumed that dims cannot provide new shape information for dimensions that are already implied by the parameters. So we simply created an RV and then only looked at dims for dimensions that were missing on the left for resizing. However, we need to resize all dimensions because dims may broadcast the dimensions implied by the parameters as well.

This is equally true for resizing from observed, I think.
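In NumPy terms (an illustration, not PyMC code), the resize has to broadcast the shape implied by the parameters against the shape given by dims, rather than only prepending missing dimensions:

```python
import numpy as np

implied = (10, 1)    # shape implied by mu = np.zeros((10, 1))
from_dims = (10, 7)  # shape given by dims=("dim1", "dim2")
# Broadcasting resolves the length-1 implied dimension against dims.
print(np.broadcast_shapes(implied, from_dims))  # (10, 7)
```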

@ricardoV94 ricardoV94 changed the title Dims is not respected when broadcasting parameters RVs implied dimensions are not broadcasted by dims or observed Aug 11, 2022
@ricardoV94 ricardoV94 changed the title RVs implied dimensions are not broadcasted by dims or observed Implied dimensions of RVs are not broadcasted by dims or observed Aug 11, 2022
@michaelosthege
Member

[...] the problem is that we wrongly assumed that dims cannot provide new shape information for dimensions that are already implied by the parameters.

this ☝️

The code snippet @bherwerth posted looks fine at first glance, but to systematically fix this we also have to consider the case where there's a mix of already-known and not-yet-known dims.

And ultimately we have to update our decision of what is the intended behavior:

  • Should dims take priority over shape?
  • Should dims take priority over size?
  • What should happen if an already-known-dim has a different length than the corresponding entry in shape or size? How do we detect that? Do we detect this at all, or will dims just silently override whatever shape/size may say?

One solution to clean up this behavior mess could be to replace shape, size, and dims with one kwarg typed Sequence[Optional[Union[str, int, Variable]]].

IMO we should thoroughly think this through before touching it.


As general advice to everybody: the best way to avoid shape issues (like the one this issue was originally about) is to not rely on "smart" broadcasting, but to do all broadcasts/resizes intentionally.
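For instance, in the original example the parameter can be broadcast explicitly before the variable is built (a sketch of that advice, using the shapes from the report):

```python
import numpy as np

mu = np.zeros((10, 1))
# Broadcast intentionally instead of relying on dims to resize the RV:
mu_full = np.broadcast_to(mu, (10, 7))
# pm.Normal("x", mu=mu_full, dims=("dim1", "dim2")) is then unambiguous.
print(mu_full.shape)  # (10, 7)
```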

@ricardoV94
Member

ricardoV94 commented Aug 11, 2022

About precedence: size/shape > dims > observed

Because they go from more flexible to more inflexible. Size/shape can be an arbitrary symbolic expression, dims and observed are usually static but can be resized, and observed is a bit more implicit and restricted to likelihoods.
