This repository was archived by the owner on Jul 10, 2025. It is now read-only.

Conversation

lzr-official
Member

@lzr-official lzr-official commented Nov 6, 2019

Review period open through 2019-11-23

TPU SavedModel Export API for TF2.x

Status Proposed
Author(s) Zhuoran Liu ([email protected]), Youlong Cheng ([email protected])
Sponsor Jonathan Hseu ([email protected])
Updated 2019-11-06

Objective

Provide an API that allows TF2 users to export TPU saved models for
inference, which:

  • Provides a user-friendly way to specify which function to run on TPU;
  • Hides Graph construction and TPU-specific inference logic (multi-core
    support, etc.) from users;
  • Allows specifying tags in the SavedModel.

Motivation

Limitation of current tf.saved_model.save()

MetaGraphDef allows saving customized tags. Downstream components such as the
TPU model server and the TFX infra-validator use these tags to load a specific
MetaGraph. However, tf.saved_model.save() does not let users specify the set
of tags in the MetaGraphDef; it hard-codes a single MetaGraph with only the
‘serve’ tag.
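To illustrate why the tag set matters, here is a minimal pure-Python sketch (not TensorFlow's actual implementation; `select_meta_graph` and its arguments are hypothetical names) of how a downstream serving component picks a MetaGraph by tag set:

```python
# Hypothetical sketch: a downstream loader selecting a MetaGraph by tag set.
# `meta_graphs` is a list of (tags, graph) pairs standing in for the
# MetaGraphDefs inside a SavedModel.
def select_meta_graph(meta_graphs, requested_tags):
    """Return the graph whose tag set exactly matches `requested_tags`."""
    requested = frozenset(requested_tags)
    for tags, graph in meta_graphs:
        if frozenset(tags) == requested:
            return graph
    raise ValueError(f"No MetaGraph with tags {sorted(requested)}")
```

With only a hard-coded ‘serve’ tag, a component requesting, say, {‘serve’, ‘tpu’} finds no matching MetaGraph, which is the limitation this proposal addresses.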


@karmel karmel left a comment


Can we do this in a way that doesn't require peppering tpu kwargs throughout existing APIs?

tf.keras.models.save_model(
    model,
    filepath='...',
    export_to_tpu=True)

I would like to avoid having tpu-specific kwargs added to the Keras paths. Let's think of a better way to do this? Can it be controlled from the dist strat side, where we have TPU-awareness? CC @fchollet , @k-w-w .

Contributor

If it's possible on the dist strat side, I prefer this idea over adding new arguments to the function.

Member

+1. Hardcoding one particular architecture into save_model seems unfortunate, and it encodes device-specific information at a very high level (also, why not an export_to_gpu, export_to_npu, export_to_ipu, ...?)

one MetaGraph, which has the ‘serve’ tag hard-coded.

`tags` is an optional argument. It is a Python iterable representing the
list of tags for the MetaGraph. This allows users to specify custom tags.

Can you explain why this is necessary? If the joint tags are being assigned to a single metagraph... why bother? Shouldn't the signatures be sufficient? I would hope that tags can fade away in 2.0, because they are confusing.

passed through to the place where the PartitionedCallOp is created. Originally,
all stateful functions generate a StatefulPartitionedCallOp. Now we
switch to TPUPartitionedCallOp, and this routing is done by checking the
value of `use_tpu_partitioned_call`.
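The routing described above can be sketched in plain Python (a hypothetical helper for illustration; in TensorFlow the real logic lives in the function-call lowering, not in a standalone function):

```python
# Hypothetical sketch of the op-type routing described above: when
# `use_tpu_partitioned_call` is set, a TPUPartitionedCallOp is emitted
# instead of the usual (Stateful)PartitionedCallOp.
def partitioned_call_op_type(is_stateful, use_tpu_partitioned_call):
    if use_tpu_partitioned_call:
        return "TPUPartitionedCall"
    return "StatefulPartitionedCall" if is_stateful else "PartitionedCall"
```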

I would prefer the details of TPU partitioned calls didn't leak all the way up to the tf.function interface. Can we think of a better way? CC @jaingaurav

1. `export_to_tpu`: simply setting this to `True` exports a TPU model;
2. `tags_signatures`: optionally, for advanced users who want more control
over which tags / signatures they use, this argument can be passed as in the
TF2.x saving API.
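A hedged sketch of the surface being discussed (a hypothetical wrapper; `export_to_tpu` and the tag defaults are assumptions drawn from this discussion, not a released TensorFlow API):

```python
# Hypothetical sketch of the proposed save entry point. `export_to_tpu`
# selects the TPU export path; `tags` (the argument that replaced the
# earlier `tags_signatures` draft) lets callers override the MetaGraph
# tag set. Returns a summary dict instead of writing a real SavedModel.
def save(model, export_dir, export_to_tpu=False, tags=None):
    if tags is None:
        # Assumed defaults: TPU serving stacks conventionally load the
        # {'serve', 'tpu'} tag set; plain serving uses {'serve'}.
        tags = {"serve", "tpu"} if export_to_tpu else {"serve"}
    return {"export_dir": export_dir, "tags": sorted(tags)}
```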

I believe (2) was removed, but (1) is still too much.

Member Author

Sorry, (2) was a mistake; it should be `tags`.
For (1), what suggestions do you have on how to let this method know we are exporting a TPU model?

@googlebot

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.
In order to pass this check, please resolve this problem and then comment @googlebot I fixed it.. If the bot doesn't comment, it means it doesn't think anything has changed.

ℹ️ Googlers: Go here for more info.

@googlebot googlebot added cla: no and removed cla: yes labels Nov 9, 2019
@ewilderj ewilderj changed the title TPU SavedModel Export API for TF2 - RFC RFC: TPU SavedModel Export API for TF2 Nov 9, 2019
@ewilderj ewilderj added cla: yes RFC: Proposed RFC Design Document and removed cla: no labels Nov 9, 2019
@googlebot

A Googler has manually verified that the CLAs look good.

(Googler, please make sure the reason for overriding the CLA status is clearly documented in these comments.)

ℹ️ Googlers: Go here for more info.

@googlebot

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.
In order to pass this check, please resolve this problem and then comment @googlebot I fixed it.. If the bot doesn't comment, it means it doesn't think anything has changed.

ℹ️ Googlers: Go here for more info.

@googlebot googlebot added cla: no and removed cla: yes labels Nov 9, 2019
@brijk7
Contributor

brijk7 commented Dec 5, 2019

@ewilderj : what are the next steps on this RFC? Looks like the comment period is over, and the design review is done as well.

@ewilderj
Contributor

ewilderj commented Dec 5, 2019

Public design review meeting notes should be posted to this PR, and the PR updated to change status to "Approved" (assuming it was approved). Once that's done, it can be merged.

@brijk7
Contributor

brijk7 commented Dec 7, 2019

@lzr-google : can you please provide an update from the design review?

@lzr-official
Member Author

Hi, we didn't reach full agreement on the proposed design in its current state. More work will be needed before we have a conclusion.
Thanks!

Since this API change has been approved and checked in, I now update this doc with the final accepted design and switch status to 'accepted'.
@lzr-official lzr-official requested a review from jhseu February 5, 2020 23:26
@lzr-official
Member Author

Update on the status of this RFC: After changing the design according to review feedback, the new design as in the latest commit has been accepted, with change checked in at tensorflow/tensorflow@ee1dcbb .

@ematejska ematejska merged commit 15fab62 into tensorflow:master Feb 5, 2020
@ematejska ematejska added RFC: Accepted RFC Design Document: Accepted by Review and removed RFC: Proposed RFC Design Document labels Feb 28, 2020