graph: add demo for exotic graph types #4469
Conversation
Summary:
The graphs plugin is most frequently used to visualize run graphs, but supports more than that. This patch adds a demo that generates data for all the data types that the graphs plugin currently supports.

Test Plan:
Run `bazel run //tensorboard/plugins/graph:graphs_demo`, then point TensorBoard at `/tmp/graphs_demo`. Open the graphs dashboard and inspect the contents of the `/data/plugin/graphs/info` response. Note that it contains:

- tags with op graph but no other data (`keras/train:batch_2`)
- tags with both profile and op graphs (`profile:prof_f`)
- tags with conceptual graph only (`keras/train:keras`)
- runs with `run_graph` set to `true` (`tagged`)
- tags with profile data only (`tagged:step_0000`)

…corresponding to the five cases of `info_impl`.

wchargin-branch: graph-demo
wchargin-source: 7af67c0eb106780fffa76fffe805b9953a85958c
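The five `info_impl` cases enumerated above can be pictured as a small classifier over what data a run/tag carries. This is a hypothetical helper for illustration only; the actual `info_impl` lives in the graphs plugin backend and operates on summary metadata, not booleans:

```python
# Hypothetical sketch of the five data cases the demo exercises.
# Function name and flag arguments are illustrative, not plugin code.

def classify_tag(has_op_graph, has_profile, has_conceptual_graph, is_run_graph):
    """Return which of the five info_impl-style cases a tag falls into."""
    if is_run_graph:
        return "run_graph"              # e.g. run `tagged`
    if has_conceptual_graph:
        return "conceptual_graph_only"  # e.g. `keras/train:keras`
    if has_op_graph and has_profile:
        return "op_graph_and_profile"   # e.g. `profile:prof_f`
    if has_op_graph:
        return "op_graph_only"          # e.g. `keras/train:batch_2`
    if has_profile:
        return "profile_only"           # e.g. `tagged:step_0000`
    return "no_graph_data"
```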
Summary:
Events with top-level `tagged_run_metadata` are now transformed at read time into blob sequence summaries. As a result, the graph plugin can get all its data via tensor summaries. This also includes replacing calls to `multiplexer.Graph` with reads of the `__run_graph__` tensor.

Since there’s an existing `PLUGIN_NAME_RUN_METADATA`, I had initially hoped to use that for the summaries. But the graphs plugin actually expects that data to include a graph (even though there is a separate plugin called `PLUGIN_NAME_RUN_METADATA_WITH_GRAPH`), and there’s no way to tell it apart from just the metadata. Thus, we implement this via a new plugin name, which means that there are few structural changes to the graph plugin code.

Test Plan:
All data from the graphs demo (added in #4469) still works. Grepping for `_multiplexer` in the graphs plugin code shows that the only calls are to `PluginRunToTagToContent` and `Tensors`, both of which have equivalents in the data provider API.

wchargin-branch: graph-data-provider-compatible
wchargin-source: e9f45a6529f7d07a725d26e60bef13914ae68a41
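The naming problem described in that commit message can be sketched as a tiny dispatch: data filed under the existing run-metadata plugin names is expected to contain a graph, so bare `RunMetadata` (from `tagged_run_metadata` events) needs a name of its own. The new constant's value below is an assumption for illustration; only the two existing names are taken from the discussion above:

```python
# Existing plugin names referenced in the PR description.
PLUGIN_NAME_RUN_METADATA = "graph_run_metadata"
PLUGIN_NAME_RUN_METADATA_WITH_GRAPH = "graph_run_metadata_graph"
# Assumed value for the new plugin name introduced by this change.
PLUGIN_NAME_TAGGED_RUN_METADATA = "graph_tagged_run_metadata"

def payload_has_graph(plugin_name):
    """Whether the graphs plugin expects this payload to include a graph.

    Bare RunMetadata cannot be told apart from graph-bearing RunMetadata
    under the old names, which is why a new name is needed at all.
    """
    return plugin_name in (
        PLUGIN_NAME_RUN_METADATA,
        PLUGIN_NAME_RUN_METADATA_WITH_GRAPH,
    )
```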
```python
def profile():
    """Create data with op graphs and profile data.
```
I find it a tiny bit odd that the graphs demo is creating profile data, which is a separate data structure used by a different plugin. If you expect to see profile information in `RunMetadata` with `profile=True`, I believe the new profiler is separate from `RunMetadata` and will be useless (it creates profile trace files but does not populate the `RunMetadata`, like L54). Am I missing something?
This is the only way that I could find to generate data that lives under the `graph_run_metadata` plugin. It’s written in an unexported function, `summary_ops_v2.run_metadata`, which looks to only be called in this code path.

Is there a different way that I can test this functionality?
I see. For the `trace_on(graph=True, profiler=False)` case, you are relying on the Keras callback? Since we do not closely control the TB Keras callback, I think it makes more sense to explicitly exercise:

```python
_PLUGIN_NAME_RUN_METADATA_WITH_GRAPH = "graph_run_metadata_graph"
```
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that’s right. Sure, I suppose there’s no harm in doing that.
wchargin-branch: graph-demo
wchargin-source: ee184fba262102cb7977574a553ae0927b4748f4
```python
with tf.summary.create_file_writer(logdir).as_default():
    for step in range(3):
        tf.summary.trace_on(profiler=True)
        print(f(step).numpy())
```
I think for a more correct-looking graph, you need to do `tf.constant(step)` here and not pass a Python number as input to the `tf.function`.
That doesn’t work:
```diff
diff --git a/tensorboard/plugins/graph/graphs_demo.py b/tensorboard/plugins/graph/graphs_demo.py
index 2329b2581..38dd41cb4 100644
--- a/tensorboard/plugins/graph/graphs_demo.py
+++ b/tensorboard/plugins/graph/graphs_demo.py
@@ -123,3 +123,3 @@ def profile():
         tf.summary.trace_on(profiler=True)
-        print(f(step).numpy())
+        print(f(tf.constant(step)).numpy())
         tf.summary.trace_export("prof_f", step=step, profiler_outdir=logdir)
```

```
TypeError: in user code:

    /HOMEDIR/.cache/bazel/_bazel_wchargin/52a95bbdd50941251730eb33b7476a66/execroot/org_tensorflow_tensorboard/bazel-out/k8-opt/bin/tensorboard/plugins/graph/graphs_demo.runfiles/org_tensorflow_tensorboard/tensorboard/plugins/graph/graphs_demo.py:115 f *
        return tf.constant(i) + tf.constant(i)
    /VIRTUAL_ENV/lib/python3.8/site-packages/tensorflow/python/framework/constant_op.py:264 constant **
        return _constant_impl(value, dtype, shape, name, verify_shape=False,
    /VIRTUAL_ENV/lib/python3.8/site-packages/tensorflow/python/framework/constant_op.py:281 _constant_impl
        tensor_util.make_tensor_proto(
    /VIRTUAL_ENV/lib/python3.8/site-packages/tensorflow/python/framework/tensor_util.py:457 make_tensor_proto
        _AssertCompatible(values, dtype)
    /VIRTUAL_ENV/lib/python3.8/site-packages/tensorflow/python/framework/tensor_util.py:334 _AssertCompatible
        raise TypeError("Expected any non-tensor type, got a tensor instead.")
```
As written, the graph looks okay to me (screenshot omitted).
And I don’t think that it matters too much what exactly the graph looks
like; I’m mostly trying to check that the data gets plumbed through
properly. It would be cool to have a set of demos as test cases for
weird graph rendering issues, but this isn’t meant to fill that need.
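For context on the suggestion above: passing a Python number to a `tf.function` causes a separate trace for each distinct value, whereas a tensor argument is traced once per dtype/shape and reused across values. A minimal pure-Python analogy of that tracing cache (no TensorFlow required; `Spec`, `make_traced`, and the cache-key scheme are all invented for illustration):

```python
# Pure-Python analogy of tf.function's tracing cache. Python scalars key
# the cache by concrete value (one trace per value); tensor-like inputs
# key it by dtype (one trace reused for all values). Illustrative only.

class Spec:
    """Stand-in for a tensor: carries a dtype and a value."""
    def __init__(self, dtype, value):
        self.dtype, self.value = dtype, value

def make_traced(fn):
    cache = {}
    def traced(arg):
        # Tensor-like inputs are keyed by dtype; Python numbers by value.
        key = ("spec", arg.dtype) if isinstance(arg, Spec) else ("py", arg)
        if key not in cache:
            cache[key] = fn  # "tracing": in TF, this builds a new graph
        return cache[key](arg.value if isinstance(arg, Spec) else arg)
    traced.cache = cache
    return traced

double = make_traced(lambda x: x + x)
for step in range(3):
    double(step)             # three traces: one per Python value
for step in range(3):
    double(Spec("int32", step))  # a single extra trace, reused each step
```

In the demo's case, the Python-number retraces are harmless; the extra traces just produce one op graph per step.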
Summary:
This patch replaces all use of the multiplexer in the graphs plugin with use of the data provider API. The graphs plugin could already read run graphs from the data provider API; now, it can also read op graphs, Keras conceptual graphs, profiling metadata, and all that good stuff. This isn’t gated behind a feature flag because the functionality is much less heavily used, so it wouldn’t be an “everything on fire” situation if some edge case breaks.

Test Plan:
All data from the graphs demo (added in #4469) still works. Grepping for `multiplexer` in the graphs plugin code shows no results.

wchargin-branch: graph-data-provider-only
wchargin-source: c5e220148d1e792c1ebeedfcd28df4f33a384438
Summary:
Events with top-level `tagged_run_metadata` are now transformed at read time into blob sequence summaries. As a result, the graph plugin can get all its data via tensor summaries. This also includes replacing calls to `multiplexer.Graph` with reads of the `__run_graph__` tensor.

Since there’s an existing `PLUGIN_NAME_RUN_METADATA`, I had initially hoped to use that for the summaries. But the graphs plugin actually expects that data to include a graph (even though there is a separate plugin called `PLUGIN_NAME_RUN_METADATA_WITH_GRAPH`), and there’s no way to tell it apart from just the metadata. Thus, we implement this via a new plugin name, which means that there are few structural changes to the graph plugin code.

As written, the code still uses the multiplexer, but is set up to make it easy to read from the data provider instead. We make that change in a follow-up PR (see #4473).

Test Plan:
All data from the graphs demo (added in #4469) still works. Grepping for `_multiplexer` in the graphs plugin code shows that the only calls are to `PluginRunToTagToContent` and `Tensors`, both of which have equivalents in the data provider API.

wchargin-branch: graph-data-provider-compatible
Summary:
This patch adds support for TF 1.x `tagged_run_metadata` events. Because this is a new top-level event type, the change extends into the `run` module as well as `data_compat`, but the changed surface area is still rather small.

Test Plan:
The graphs demo added in #4469 includes tagged run metadata graphs. They now appear in the graphs dashboard with `--load_fast`, and include compute time information.

wchargin-branch: rust-tagged-run-metadata
wchargin-source: e8ed2e7af25aba1206bccf77218edaf231b1c858
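The read-time transformation that these changes rely on can be sketched roughly as follows. The record shapes here are invented dictionaries for illustration; the real `data_compat` layer operates on `Event`/`Summary` protobufs, and the plugin-name string is an assumption:

```python
# Hypothetical sketch of migrating a TF 1.x tagged_run_metadata event
# into a blob-sequence-style summary at read time. Dict shapes and the
# plugin name are illustrative, not TensorBoard's actual representation.

def migrate_event(event):
    """Rewrite a tagged_run_metadata event; pass other events through."""
    if "tagged_run_metadata" not in event:
        return event
    trm = event["tagged_run_metadata"]
    return {
        "step": event["step"],
        "summary": {
            "tag": trm["tag"],
            "plugin_name": "graph_tagged_run_metadata",  # assumed name
            # Serialized RunMetadata bytes become a 1-element blob sequence.
            "blob_sequence": [trm["run_metadata"]],
        },
    }
```

With a transform of this shape in the read path, the dashboard never needs a special code path for the legacy event type: everything arrives as summaries.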
Summary:
The graphs plugin is most frequently used to visualize run graphs, but supports more than that. This patch adds a demo that generates data for all the data types that the graphs plugin currently supports.

Test Plan:
Run `bazel run //tensorboard/plugins/graph:graphs_demo`, then point TensorBoard at `/tmp/graphs_demo`. Open the graphs dashboard and inspect the contents of the `/data/plugin/graphs/info` response. Note that it contains:

- tags with op graph but no other data (`keras/train:batch_2`, `profile:prof_g`)
- tags with both profile and op graphs (`profile:prof_f`)
- tags with conceptual graph only (`keras/train:keras`)
- runs with `run_graph` set to `true` (`tagged`)
- tags with profile `RunMetadata` only (`tagged:step_0000`)

…corresponding to the five cases of `info_impl`.

wchargin-branch: graph-demo