Skip to content

ENH: Customizable plate labels #7319

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
williambdean opened this issue May 15, 2024 · 11 comments · Fixed by #7392
Closed

ENH: Customizable plate labels #7319

williambdean opened this issue May 15, 2024 · 11 comments · Fixed by #7392

Comments

@williambdean
Copy link
Contributor

williambdean commented May 15, 2024

Before

# Currently would have to create graphviz instance
graph = pm.model_to_graphviz(model)
# Save output to .dot file and manually modify subgraph labels
print(graph)
# Resave as pdf using dot cli 

After

# Might be similar to that of where the entry point to the change would be in `pm.model_to_graphviz`

def named_vars_plate_label(
    var: TensorVariable, 
    named_vars_to_dims: dict[str, Sequence[str]], 
    dim_lengths: dict[str, int],
) -> str: 
    dims = named_vars_to_dims[var.name]
    return " x ".join(str(dim) for dim in dims)
    
plate_formatters = {
    "Named Vars": named_vars_plate_label,
}
pm.model_to_graphviz(model, plate_formatters=plate_formatters)

Context for the issue:

I am interested in making graphviz diagraphs with a bit more customization of the plate labels than currently supported. Specifically, I would like to allow for labels without the shape size to provide a more abstract representation of the model. My specific example might look like this:
example-graph

I am willing to take up this issue. Wondering if there are any preferences on the implementation details? My idea would involve extracting this logic here and providing some logic based on:

  • var : TensorVariable
  • named_vars_to_dims : var.name mapping to coord names
  • dim_lengths : coord names mapping to either TensorVariable or pre-evaluated shapes (addressing this TODO)

Related to #6716

@ricardoV94
Copy link
Member

How does this differ from #7302 ?

@williambdean
Copy link
Contributor Author

williambdean commented May 15, 2024

How does this differ from #7302 ?

7302 is styling of the nodes and this is name of the labels for each plate (label in bottom of each rectangle)

In the above example it would show the dim size. Ie obs (52) x covariate (5)

@ricardoV94
Copy link
Member

ricardoV94 commented May 15, 2024

I feel that may be going too far? Can it be done after a canonical graph was created by PyMC from the outside?

Or anyway that a user can have that flexibility without we having to handle it

@ricardoV94
Copy link
Member

Regarding shape size specifically we can disable that with a boolean flag, if that's all you need?

@williambdean
Copy link
Contributor Author

I feel that may be going too far? Can it be done after a canonical graph was created by PyMC from the outside?
Or anyway that a user can have that flexibility without we having to handle it

Yes, fair. I agree. Ideally that'd be something to process afterwards. However, based on current understanding of graphviz package is that would not be possible. I could be wrong but it seems like much of the information is stored in pre-formatted strings and pieced into larger dot format. reference

I suspect it might be possible through the networkx objects. i.e. pymc model -> networkx -> modifications -> graphviz

Regarding shape size specifically we can disable that with a boolean flag, if that's all you need?

Yes, I think that that is a fine middle ground. You are thinking a single boolean? For example,

# Default resulting in plate labels like obs (52) x covariates (3)
pm.model_to_graphviz(model)
# Optional boolean to result in plate labels without number in parentheses. i.e. obs x covariates
pm.model_to_graphviz(model, include_shape_size=False)

@ricardoV94
Copy link
Member

The networkx path may not be too crazy?

@williambdean
Copy link
Contributor Author

The networkx path may not be too crazy?

Will give it a try and report back with findings.

I trying to avoid parsing and reconstruction of the strings that we create

@williambdean
Copy link
Contributor Author

williambdean commented Jun 22, 2024

I explore this a bit and I wasn't able to get the result that I wanted.

The cons of it:

  • networkx converts to pygraphviz instead of graphviz which requires an additional dependency
  • Plate information seems to be lost after it is converted via networkx
  • Also, parsing of labels would have to happen regardless

Hard to say there were any pros 😆

I think an include_shape_size boolean flag would be the easiest way to go. What are your thoughts on this? @ricardoV94

@ricardoV94
Copy link
Member

ricardoV94 commented Jun 22, 2024

Ideally we would have a very bare bones utility that gives users the model tree structure and meta info and let them do whatever they want with it.

Our graphviz gen could use this utility, but the code would be simple enough for someone to just copy and tweak without we having to offer customization for all the visual elements that accidentally became the default?

If that feels like too much work, that's fair and we can add one more kwarg. But since people are becoming interested in this I worry about the next round of feature requests.

WDYT?

@williambdean
Copy link
Contributor Author

williambdean commented Jun 22, 2024

Ideally we would have a very bare bones utility that gives users the model tree structure and meta info and let them do whatever they want with it.

Our graphviz gen could use this utility, but the code would be simple enough for someone to just copy and tweak without we having to offer customization for all the visual elements that accidentally became the default?

If that feels like too much work, that's fair and we can add one more kwarg. But since people are becoming interested in this I worry about the next round of feature requests.

WDYT?

Yeah, this is a bit of what I had in mind with the initial solution and where I was headed with the #7302 PR. That there would be some underlying generator(s) that could just provide the people with the information that they'd like. Also, that would be used to support the current methods (seems like the graphviz and networkx methods are a bit intertwined and might have some consolidation / reliance on this underlying utility).

I can think about the implementation a bit. Do you have any thoughts on how you'd like it implemented?

@ricardoV94
Copy link
Member

I can think about the implementation a bit. Do you have any thoughts on how you'd like it implemented?

No, but very curious what you can come up with!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants