FlatSwitch Op for logprob derivation of arbitrary censoring #6949


Open: shreyas3156 wants to merge 39 commits into main from logprob-flatswitch-arbitrary-censoring

Conversation

shreyas3156 (Contributor) commented Oct 12, 2023

What is this PR about?
This PR defines a FlatSwitch Op that aims to extract the intervals and their respective encodings required to infer the logprob of arbitrary censored distributions. It achieves this in the following steps:

  1. Recursively extract the intervals defined by the conditions in the nested pt.switch() calls.
  2. Adjust/clip these intervals to eliminate the overlap with the outer switch.
  3. Identify which interval each of the true and false branches corresponds to, since each condition splits the space into two parts.

It then checks that the switch condition and the measurable branches are not broadcast, and that all measurable components share the same source of measurability. The logic for these checks is based on #6834. (A schematic sketch of the interval extraction is shown below.)
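Below is a schematic, pure-Python sketch of steps 1-3, using a toy nested-tuple representation of the switch instead of the actual PyTensor graph traversal; `flatten_switch` and the tuple encoding are purely illustrative and are not the Op's implementation:

```python
import math

def flatten_switch(expr, lower=-math.inf, upper=math.inf):
    """Return (lower, upper, encoding) triples for a toy nested switch."""
    if not (isinstance(expr, tuple) and expr[0] == "lt"):
        # Leaf: either a constant encoding or the base RV itself.
        return [(lower, upper, expr)]
    _, threshold, true_branch, false_branch = expr
    # The condition `base_rv < threshold` splits the current interval in two;
    # clipping against (lower, upper) removes the overlap with the outer switch.
    return flatten_switch(true_branch, lower, min(upper, threshold)) + flatten_switch(
        false_branch, max(lower, threshold), upper
    )

# Mirrors the rv2 example below: switch(x < -1, -1, switch(x < 1, 1, x))
print(flatten_switch(("lt", -1, -1, ("lt", 1, 1, "base_rv"))))
# [(-inf, -1, -1), (-1, 1, 1), (1, inf, 'base_rv')]
```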

Once the intervals and their respective encodings are known, they can be used to calculate the log-probability. So, on running something like

rv2 = pt.switch(
    base_rv < -1,
    -1,
    pt.switch(
        base_rv < 1,  # -inf to 1, 1 to inf
        1,
        base_rv
    ),
)

we get something like:

lower: -1.0
upper: 1.0
encoding: 1 

lower: 1.0
upper: inf
encoding: normal_rv{0, (0, 0), floatX, False}.out 
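
For intuition, here is a minimal sketch (not the Op's actual logprob implementation) of how the encoded interval (-1, 1) and the uncensored interval (1, inf) from the example above could contribute to the log-probability, using `pm.logcdf`, `pm.logp`, and `pymc.math.logdiffexp` (the latter also appears in the diff below); the variable names are illustrative only:

```python
import pymc as pm
import pytensor.tensor as pt
from pymc.math import logdiffexp

base_rv = pm.Normal.dist(0, 1)
value = pt.scalar("value")

# Encoded interval (-1, 1) collapsed to the constant 1: all of its probability
# mass, P(-1 < base_rv < 1), is assigned to the point value == 1.
encoded_logp = logdiffexp(pm.logcdf(base_rv, 1.0), pm.logcdf(base_rv, -1.0))

# The interval (1, inf) keeps base_rv itself, so the usual density applies there.
uncensored_logp = pm.logp(base_rv, value)

logp = pt.switch(pt.eq(value, 1.0), encoded_logp, uncensored_logp)
```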

TO-DO:

  • The checks should work not only on pt.switch(x>0, x, a) but also on pt.switch(pt.exp(x)>0, pt.exp(x), b), where a and b are some encodings.
  • In the FlatSwitch Op, add base_rv, the list of intervals, and the corresponding encodings as inputs to the node so that they can be unpacked in the logprob calculation (a rough sketch follows).
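
A rough, hypothetical sketch of what the second item could look like; the class name, input layout, and `perform` behaviour here are assumptions for illustration, not the PR's final design:

```python
import pytensor.tensor as pt
from pytensor.graph.basic import Apply
from pytensor.graph.op import Op


class FlatSwitchesSketch(Op):
    """Hypothetical Op packing base_rv, interval bounds and encodings as inputs."""

    def make_node(self, switch_out, base_rv, interval_bounds, *encodings):
        # interval_bounds: assumed (2, n_intervals) tensor of lower/upper edges;
        # encodings: one variable per interval (a constant or base_rv itself).
        inputs = [
            pt.as_tensor_variable(switch_out),
            pt.as_tensor_variable(base_rv),
            pt.as_tensor_variable(interval_bounds),
            *(pt.as_tensor_variable(e) for e in encodings),
        ]
        return Apply(self, inputs, [inputs[0].type()])

    def perform(self, node, inputs, output_storage):
        # The rewritten node exists so the logprob registration can unpack its
        # inputs; evaluation simply forwards the original switch output.
        output_storage[0][0] = inputs[0]
```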

Checklist

@ricardoV94 @larryshamalama

shreyas3156 marked this pull request as a draft on October 12, 2023 06:48
codecov bot commented Oct 12, 2023

Codecov Report

Attention: Patch coverage is 17.91045%, with 110 lines in your changes missing coverage. Please review.

Project coverage is 87.21%. Comparing base (244fb97) to head (b5f26a4).

Additional details and impacted files


@@            Coverage Diff             @@
##             main    #6949      +/-   ##
==========================================
- Coverage   92.26%   87.21%   -5.06%     
==========================================
  Files         100      100              
  Lines       16880    17009     +129     
==========================================
- Hits        15574    14834     -740     
- Misses       1306     2175     +869     
Files                        Coverage Δ
pymc/logprob/censoring.py    34.23% <17.91%> (-61.47%) ⬇️

... and 21 files with indirect coverage changes

shreyas3156 force-pushed the logprob-flatswitch-arbitrary-censoring branch from 86ae712 to 89d4635 on November 23, 2023 15:39
shreyas3156 force-pushed the logprob-flatswitch-arbitrary-censoring branch from 44e5423 to 96e0bb0 on December 12, 2023 16:11
shreyas3156 marked this pull request as ready for review on January 5, 2024 05:39
ricardoV94 changed the title from "FlatSwitch Op for logprob derivation of arbitrary censoring" to "[WIP] FlatSwitch Op for logprob derivation of arbitrary censoring" on March 4, 2024
ricardoV94 (Member) commented Mar 4, 2024

@shreyas3156 could you solve the conflicts issue? I'll finally review this one :)

shreyas3156 force-pushed the logprob-flatswitch-arbitrary-censoring branch from ebcee22 to db20cef on March 6, 2024 08:24
ricardoV94 (Member) commented:

One of the pre-existing tests is failing; I'm not sure if it's due to these changes, but I would guess so.

https://github.com/pymc-devs/pymc/actions/runs/8169791795/job/22334588152?pr=6949#step:7:478

ricardoV94 (Member) left a review:

Looks good. I left comments about the need for docstrings, renaming a few things, and removing TODO comments.

@@ -238,3 +260,282 @@ def round_logprob(op, values, base_rv, **kwargs):
from pymc.math import logdiffexp

return logdiffexp(logcdf_upper, logcdf_lower)


class FlatSwitches(Op):
ricardoV94:

Add docstrings explaining what this Op does / where it is used for. Most importantly what is the IR representation this Op uses for what kind of original graphs. This can be done either here or in the main rewrite. If on the main rewrite, just mention here to check out the docstring in the rewrite.

MeasurableVariable.register(FlatSwitches)


def get_intervals(binary_node, valued_rvs):
ricardoV94:

Add docstrings explaining what this does, possibly also input/output type hints

return [interval_true, interval_false]


def adjust_intervals(intervals, outer_interval):
ricardoV94:

Add docstrings and possibly type hints



@node_rewriter(tracks=[switch])
def find_measurable_flat_switch_encoding(fgraph: FunctionGraph, node: Node):
ricardoV94:

Suggestion, because "flat" is the IR output, not what is being found?

Suggested change
def find_measurable_flat_switch_encoding(fgraph: FunctionGraph, node: Node):
def find_nested_switch_encoding(fgraph: FunctionGraph, node: Node):

encodings, intervals = [], []
rv_idx = ()

# TODO: Some alternative cleaner way to do this
ricardoV94:

What is dirty about this approach? If something is, add a comment explaining it; otherwise remove the TODO?



@_logprob.register(FlatSwitches)
def flat_switches_logprob(op, values, base_rv, *inputs, **kwargs):
ricardoV94:

Suggested change
def flat_switches_logprob(op, values, base_rv, *inputs, **kwargs):
def nested_switch_encoding_logprob(op, values, base_rv, *inputs, **kwargs):

ricardoV94:

Also add some comment in the docstrings about the kind of logp graphs we are generating?

encodings, pt.eq(pt.unique(encodings, axis=0).shape[0], len(encodings))
)

# TODO: We do not support the encoding graphs of discrete RVs yet
ricardoV94:

Remove this easy-to-miss TODO. Either mention it in the docstrings, or add a NotImplementedError so we don't forget to update the logp function?

logcdf_interval_bounds[1, ...], logcdf_interval_bounds[0, ...]
) # (encoding, *base_rv.shape)

# default logprob is -inf if there is no RV in branches
ricardoV94:

Explain what's happening in the first if branch as well?

Comment on lines +479 to +480
# Possible TODO:
# encodings = op.get_encodings_from_inputs(inputs)
ricardoV94:

Either implement or remove TODO since it's not a high priority anyway?

class FlatSwitches(Op):
ricardoV94:

May be more intuitive?

Suggested change
class FlatSwitches(Op):
class NestedEncodingSwitches(Op):
