Interactive cli for python converter #393
base: master
Conversation
Ping, amazing work! This is incredible. A few high-level points:
Thanks,
- Will generate a design doc.
- Since different model formats require different things (a SavedModel needs a directory, Keras needs a file), I think it's better to let the user choose the format first. (The CLI already creates different questions based on the selected input format.)
- Tag/signature lists are loaded from the saved model and shown as options; users just need to select one.
- Agree, will add.
- Will do.
Reviewable status: 0 of 1 approvals obtained (waiting on @caisq, @dsmilkov, and @nsthorat)
Let's talk over there! I think we should still detect the input format :)
Updated the CLI workflow according to the comments, and uploaded a new video: https://drive.google.com/open?id=1hWabsvY4d3GdDJxUlBwMiBuawIL1ulfI
PTAL.
Reviewable status: 0 of 1 approvals obtained (waiting on @caisq, @dsmilkov, and @nsthorat)
Took a first pass. Good work. A concern I have is about the third_party dependency.
Reviewed 2 of 7 files at r1, 1 of 3 files at r2.
Reviewable status: 0 of 1 approvals obtained (waiting on @caisq, @dsmilkov, @nsthorat, and @pyu10055)
python/requirements.txt, line 7 at r2 (raw file):
tensorflow==1.14.0 tensorflow-hub==0.5.0 PyInquirer==1.0.3
This is a bit problematic, since the library is not in third_party yet and we want to sync the Python converter internally. Let's sync it to third_party before moving forward, so we know it's possible. Otherwise, how about using Prompt-Toolkit (v2), which is already in third_party: https://github.com/prompt-toolkit/python-prompt-toolkit/tree/2.0
python/tensorflowjs/cli.py, line 24 at r2 (raw file):
from PyInquirer import prompt
from examples import custom_style_3
where does the examples namespace come from?
python/tensorflowjs/cli.py, line 27 at r2 (raw file):
from tensorflow.python.saved_model.loader_impl import parse_saved_model
from tensorflow.core.framework import types_pb2
from tensorflowjs.converters.converter import main as convert
Not a big fan of aliasing, since it's harder to follow which method is actually called. Let's rename main to convert in converter.py. See comment below.
python/tensorflowjs/cli.py, line 42 at r2 (raw file):
def quantization_type(value):
s/value/user_selection_quant/ for readability
python/tensorflowjs/cli.py, line 45 at r2 (raw file):
"""Determine the quantization type based on user selection. Args: value: user selected value.
add returns section
python/tensorflowjs/cli.py, line 58 at r2 (raw file):
def of_values(answers, key, values):
The name hides a bit what this function does. Let's call it value_in_list(value, list) and do the dict lookup outside to make the code more readable: value_in_list(answers[key], valid_answers)
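A minimal sketch of the suggested helper (the example call in the comment is illustrative):

```python
def value_in_list(value, valid_values):
  """Returns True if `value` is one of `valid_values`."""
  return value in valid_values


# The caller does the dict lookup, keeping the helper generic, e.g.:
# value_in_list(answers['input_format'], ('keras', 'tf_saved_model'))
```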
python/tensorflowjs/cli.py, line 72 at r2 (raw file):
def getTFJSModelType(file):
snake case?
python/tensorflowjs/cli.py, line 78 at r2 (raw file):
def detect_input_format(answers):
Instead of storing a temporary variable inside answers[DETECTED_INPUT_FORMAT], let's have this method return the detected_input_format and normalized_input_path as results, then pass those results where you need them. This way you don't need to ignore "special answers" when you generate flags later.
Do the overriding of answers[input_path] = normalized_input_path outside this method.
Also have input_path as the only input to this function - it's easier to understand what this function actually needs.
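A rough sketch of that refactor, with the actual detection checks replaced by placeholders:

```python
import os


def detect_input_format(input_path):
  """Guesses the model format from the path.

  Returns:
    A (detected_input_format, normalized_input_path) tuple; the format is
    None when it cannot be determined.
  """
  normalized_path = os.path.expanduser(input_path)
  detected_format = None
  if os.path.isdir(normalized_path) and any(
      name.endswith('.pb') for name in os.listdir(normalized_path)):
    detected_format = 'tf_saved_model'      # placeholder constant
  elif normalized_path.endswith('.json'):
    detected_format = 'tfjs_layers_model'   # placeholder constant
  return detected_format, normalized_path


# The caller overrides the answer outside this function:
# fmt, path = detect_input_format(answers['input_path'])
# answers['input_path'] = path
```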
python/tensorflowjs/cli.py, line 107 at r2 (raw file):
"""Determine question for model's input path. Args: answer: Dict of user's answers to the questions
this method signature is a bit misleading since you do not need the whole dictionary. Pass the minimal info this function needs. In this case input_format
python/tensorflowjs/cli.py, line 119 at r2 (raw file):
def validate_input_path(value, input_format):
s/value/input_path/
python/tensorflowjs/cli.py, line 130 at r2 (raw file):
if input_format == TF_HUB:
  if re.match(URL_REGEX, value) is None:
    return 'This is not a valid url for TFHub module: %s' % value
Add more info to the end. "We expect a URL that starts with http(s)://"
python/tensorflowjs/cli.py, line 135 at r2 (raw file):
if input_format in [KERAS_SAVED_MODEL, TF_SAVED_MODEL]:
  if not os.path.isdir(value):
    return 'The path provided is not a directory: %s' % value
let's be more flexible and work with a path to ".pb" as well, not just a directory
python/tensorflowjs/cli.py, line 137 at r2 (raw file):
return 'The path provided is not a directory: %s' % value
if not any(fname.endswith('.pb') for fname in os.listdir(value)):
  return 'This is an invalid saved model directory: %s' % value
Let's make the error more concrete: "Did not find a .pb file inside the model's directory." (Also make sure you remove "saved model", since this is also used for Keras.)
python/tensorflowjs/cli.py, line 140 at r2 (raw file):
if input_format == TFJS_LAYERS_MODEL:
  if not os.path.isfile(value):
    return 'The path provided is not a file: %s' % value
To be consistent with saved models, let's accept both a directory that has a json file inside, or a json file directly.
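A hedged sketch of what that more flexible check could look like; the error strings are illustrative, and it follows the validator convention of returning True or an error message:

```python
import os


def validate_tfjs_layers_model_path(path):
  """Accepts either a model .json file or a directory containing one."""
  if os.path.isfile(path):
    if not path.endswith('.json'):
      return 'The file provided is not a .json file: %s' % path
    return True
  if os.path.isdir(path):
    if not any(name.endswith('.json') for name in os.listdir(path)):
      return 'Did not find a .json file inside the directory: %s' % path
    return True
  return 'The path provided is not a directory or .json file: %s' % path
```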
python/tensorflowjs/cli.py, line 151 at r2 (raw file):
Args: value: input path of the model. input_format: model format string.
missing returns
python/tensorflowjs/cli.py, line 164 at r2 (raw file):
"""generate the tensorflowjs command string for the selected params. Args: params: user selected parameters for the conversion.
missing returns
python/tensorflowjs/cli.py, line 182 at r2 (raw file):
def is_saved_model(answers):
pass in minimal viable info, namely input_format and output_format.
python/tensorflowjs/cli.py, line 342 at r2 (raw file):
'filter': os.path.expanduser, 'validate': lambda value: 'Please enter a valid path' if not value else True
lambda path:
python/tensorflowjs/cli.py, line 454 at r2 (raw file):
for entry in it: print(entry.name, os.path.getsize(os.path.join(params['output_path'], entry.name)))
The entry, which is of type DirEntry, already has a stat property on it, which has file size, among other info: https://docs.python.org/3/library/os.html#os.DirEntry.stat
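For reference, a small sketch using DirEntry.stat() (the function name is illustrative):

```python
import os


def print_output_sizes(output_path):
  """Prints each generated file and its size in bytes."""
  for entry in os.scandir(output_path):
    if entry.is_file():
      # DirEntry.stat() reuses information gathered by scandir where the
      # platform allows, so there is no need for os.path.getsize().
      print(entry.name, entry.stat().st_size)
```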
python/tensorflowjs/converters/converter.py, line 495 at r2 (raw file):
return parser.parse_args()

def main(arguments=None):
let's rename this function to convert, and add another main which calls it
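A sketch of the suggested split in converter.py, with the existing body elided:

```python
def convert(arguments=None):
  """Runs the conversion; this is the logic that previously lived in main()."""
  # ... existing body of main() ...


def main(arguments=None):
  # Thin wrapper kept as the console_scripts entry point.
  convert(arguments)
```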
python/tensorflowjs/converters/converter.py, line 561 at r2 (raw file):
elif (input_format == 'tf_hub' and output_format == 'tfjs_graph_model'):
  print(FLAGS)
remove print statement
Reviewable status: 0 of 1 approvals obtained (waiting on @caisq, @nsthorat, and @pyu10055)
python/tensorflowjs/cli.py, line 29 at r2 (raw file):
from tensorflowjs.converters.converter import main as convert

# regex for recognizing valid url for TFHub module.
URL_REGEX = re.compile(
how about TFHUB_VALID_URL_REGEX
@pyu10055 It just occurred to me, but you might want to verify that PyInquirer works in notebook environments such as Jupyter Notebook and Colab. From our experience, we know many of our users do their Python model training and subsequent conversion in notebooks. It would not be ideal if the interactive CLI didn't work in those environments.
Decisions made offline:
- Will not sync the interactive CLI to g3.
- Will not try to support running the interactive CLI directly in notebooks or Colab; will add a dry-run mode that only generates the command line (see the sketch below).
- Continue using the PyInquirer library.
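A minimal sketch of the dry-run path, assuming the main() -> convert() rename discussed elsewhere in this review (the helper name is illustrative):

```python
def run_converter(arguments, dry_run=False):
  """Always echoes the generated command; skips execution on a dry run."""
  print('converter command generated:')
  print('tensorflowjs_converter %s' % ' '.join(arguments))
  print()  # blank line so the output is easier to scan
  if not dry_run:
    # Assumes the main() -> convert() rename in converter.py.
    from tensorflowjs.converters import converter
    converter.convert(arguments)
```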
Reviewable status: 0 of 1 approvals obtained (waiting on @caisq and @dsmilkov)
python/requirements.txt, line 7 at r2 (raw file):
Previously, dsmilkov (Daniel Smilkov) wrote…
this is a bit problematic since the library is not in third_party yet and we want to sync the python converter internally. Let's sync it to third_party before moving forward, so we know it's possible. Otherwise, how about using Prompt-Toolkit (v2) which is already in third party: https://github.com/prompt-toolkit/python-prompt-toolkit/tree/2.0
Resolved offline; see comment above.
python/tensorflowjs/cli.py, line 24 at r2 (raw file):
Previously, dsmilkov (Daniel Smilkov) wrote…
where does the examples namespace come from?
removed
python/tensorflowjs/cli.py, line 58 at r2 (raw file):
Previously, dsmilkov (Daniel Smilkov) wrote…
the name hides a bit what this function does. Let's call it value_in_list(value, list) and do the dict lookup outside to make the code more readable: value_in_list(answers[key], valid_answers)
Thanks for the suggestion. One thing this method also does is check whether the key is a valid key; otherwise the caller needs to do that check, which could be in quite a few places.
python/tensorflowjs/cli.py, line 182 at r2 (raw file):
Previously, dsmilkov (Daniel Smilkov) wrote…
pass in minimal viable info, namely input_format and output_format.
Done.
python/tensorflowjs/cli.py, line 454 at r2 (raw file):
Previously, dsmilkov (Daniel Smilkov) wrote…
the entry, which is of type DirEntry, already has a stat property on it, which has file size, among other info: https://docs.python.org/3/library/os.html#os.DirEntry.stat
thanks
Reviewable status: 0 of 1 approvals obtained (waiting on @caisq, @dsmilkov, and @pyu10055)
README.md, line 60 at r3 (raw file):
- Regular conversion script: tensorflowjs_converter
State the fact that both come with pip install tensorflowjs.
README.md, line 72 at r3 (raw file):
r own script,
Style nit: respect 80-char line limit whenever possible.
python/tensorflowjs/init.py, line 24 at r3 (raw file):
from tensorflowjs import cli
Style: the import lines should be sorted lexicographically.
python/tensorflowjs/BUILD, line 137 at r3 (raw file):
"//tensorflowjs:expect_tensorflow_installed",
For consistency, you should add a new dummy dependency called expect_PyInquirer_installed, following the pattern of expect_tensorflow_installed.
python/tensorflowjs/cli.py, line 24 at r3 (raw file):
from PyInquirer import prompt, style_from_dict, Token
Style: importing individual symbols is against Google Python style. Instead, import the entire module. See:
https://github.com/google/styleguide/blob/gh-pages/pyguide.md#224-decision
python/tensorflowjs/cli.py, line 37 at r3 (raw file):
# Token.Selected: '', # default
Should this line be removed?
python/tensorflowjs/cli.py, line 45 at r3 (raw file):
}) KERAS_SAVED_MODEL = 'keras_saved_model'
Should converter.py use these constants as well, for consistency and future-proofness? If you do that, perhaps these should be refactored into a shared small Python file.
python/tensorflowjs/cli.py, line 49 at r3 (raw file):
tfjs_keras_model
Should this be tfjs_graph_model?
python/tensorflowjs/cli.py, line 63 at r3 (raw file):
1/2'
Why not be consistent with the terminology we've used elsewhere, i.e., '8-bit' and '16-bit'? Introducing new terminology is confusing IMO.
python/tensorflowjs/cli.py, line 92 at r3 (raw file):
detect_
Some of the logic in this function is not nearly 100% bulletproof, so I think this function should be called "guess_input_format" instead.
python/tensorflowjs/cli.py, line 106 at r3 (raw file):
detected_input_format = TF_SAVED_MODEL
How reliable is this logic? Remember that keras H5 files are single files. So a user could have a file in a directory with many other files, and some of those other files may happen to be some unrelated protobuf (.pb) file. (Imagine someone's download folder). Is there a more unique signature of the .pb file from a TF SavedModel?
python/tensorflowjs/cli.py, line 117 at r3 (raw file):
'.hdf5'
The extension name for an HDF5 file is often .h5. In fact, it can be any extension; the HDF5 library doesn't require a specific one. I think you should try creating an HDF5 object in order to determine this condition with higher reliability.
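One way to make that check more robust, as a sketch (assumes h5py is available, which the Keras HDF5 path already requires):

```python
import h5py


def looks_like_hdf5(path):
  """True if the file is an HDF5 container, regardless of its extension."""
  try:
    return h5py.is_hdf5(path)
  except OSError:
    return False
```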
python/tensorflowjs/cli.py, line 122 at r3 (raw file):
elif (input_path.endswith('.json') and get_tfjs_model_type(input_path) == 'layers-model'):
You can try loading the .json file with Python's json module and check the fields such as generatedBy and convertedBy.
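A sketch of that check; the 'format' field is what get_tfjs_model_type reads, and the function name here is illustrative:

```python
import json


def get_tfjs_model_format(json_path):
  """Returns the 'format' field of a model.json, e.g. 'layers-model'."""
  with open(json_path, 'rt') as f:
    model = json.load(f)
  # 'generatedBy' and 'convertedBy' are additional converter-written fields
  # that could be inspected here as extra signals.
  return model.get('format')
```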
python/tensorflowjs/cli.py, line 123 at r3 (raw file):
detected_input_format = TFJS_LAYERS_MODEL
Should there be an else branch that throws an Error saying that it can't automatically guess the type of the input?
python/tensorflowjs/cli.py, line 144 at r3 (raw file):
v
nit: --> Upper case "V". Same below.
python/tensorflowjs/cli.py, line 153 at r3 (raw file):
re.match(TFHUB_VALID_URL_REGEX, path) is None
You can just do if not re.match(TFHUB_VALID_URL_REGEX, path): instead.
python/tensorflowjs/cli.py, line 158 at r3 (raw file):
[
nit: Use tuple, instead of list for cases like this.
python/tensorflowjs/cli.py, line 167 at r3 (raw file):
return 'The path provided is not a directory or .json file: %s' % path
Can we require a .json file (i.e., reject a directory) for the case of input_format == TFJS_LAYERS_MODEL? We want to be consistent with tensorflowjs_converter.
python/tensorflowjs/cli.py, line 207 at r3 (raw file):
return 'The output path already exists: %s' % output_path
In some cases, the user may want to overwrite an existing file (e.g., in the tfjs_layers_model --> keras option). Should that be allowed?
python/tensorflowjs/cli.py, line 220 at r3 (raw file):
'split_weights_by_layer'
It'll be great if these flag names can be refactored to be shared between cli.py and converter.py to reduce future maintenance cost.
python/tensorflowjs/cli.py, line 254 at r3 (raw file):
key': 'g'
Need a comment to describe what 'key' means.
python/tensorflowjs/cli.py, line 385 at r3 (raw file):
c
Upper-case "C" here?
python/tensorflowjs/cli.py, line 385 at r3 (raw file):
Weclome
Typo
python/tensorflowjs/cli.py, line 504 at r3 (raw file):
print('converter command generated:')
IMO, the generated command should always be printed, no matter whether this is a dry run.
python/tensorflowjs/converters/converter.py, line 437 at r3 (raw file):
setup_arguments(arguments=None):
Add a doc string explaining what happens when arguments is provided and what happens when it's not.
Reviewable status: 0 of 1 approvals obtained (waiting on @caisq, @dsmilkov, and @pyu10055)
README.md, line 60 at r3 (raw file):
Previously, caisq (Shanqing Cai) wrote…
- Regular conversion script: tensorflowjs_converter
State the fact that both come with pip install tensorflowjs.
The pip install is stated in step 1.
README.md, line 72 at r3 (raw file):
Previously, caisq (Shanqing Cai) wrote…
r own script,
Style nit: respect 80-char line limit whenever possible.
Done.
python/tensorflowjs/init.py, line 24 at r3 (raw file):
Previously, caisq (Shanqing Cai) wrote…
from tensorflowjs import cli
Style: the import lines should be sorted lexicographically.
Done.
python/tensorflowjs/BUILD, line 137 at r3 (raw file):
Previously, caisq (Shanqing Cai) wrote…
"//tensorflowjs:expect_tensorflow_installed",
For consistency, you should add a new dummy dependency called expect_PyInquirer_installed, following the pattern of expect_tensorflow_installed.
Done.
python/tensorflowjs/cli.py, line 24 at r3 (raw file):
Previously, caisq (Shanqing Cai) wrote…
from PyInquirer import prompt, style_from_dict, Token
Style: importing individual symbols is against Google Python style. Instead, import the entire module. See:
https://github.com/google/styleguide/blob/gh-pages/pyguide.md#224-decision
Done.
python/tensorflowjs/cli.py, line 37 at r3 (raw file):
Previously, caisq (Shanqing Cai) wrote…
# Token.Selected: '', # default
Should this line be removed?
Done.
python/tensorflowjs/cli.py, line 45 at r3 (raw file):
Previously, caisq (Shanqing Cai) wrote…
Should converter.py use these constants as well, for consistency and future-proofness? If you do that, perhaps these should be refactored into a shared small Python file.
Moved the format names and argument strings to common.py, shared between the CLI and the converter.
python/tensorflowjs/cli.py, line 49 at r3 (raw file):
Previously, caisq (Shanqing Cai) wrote…
tfjs_keras_model
Should this be tfjs_graph_model?
thanks
python/tensorflowjs/cli.py, line 63 at r3 (raw file):
Previously, caisq (Shanqing Cai) wrote…
1/2'
Why not be consistent with the terminology we've used elsewhere, i.e., '8-bit' and '16-bit'? Introducing new terminology is confusing IMO.
If we are talking about compression, it is easier for the user to visualize 1/2 the size instead of 8-bit.
python/tensorflowjs/cli.py, line 92 at r3 (raw file):
Previously, caisq (Shanqing Cai) wrote…
detect_
Some of the logic in this function is not nearly 100% bulletproof, so I think this function should be called "guess_input_format" instead.
I have updated the logic, please check again.
python/tensorflowjs/cli.py, line 106 at r3 (raw file):
Previously, caisq (Shanqing Cai) wrote…
detected_input_format = TF_SAVED_MODEL
How reliable is this logic? Remember that keras H5 files are single files. So a user could have a file in a directory with many other files, and some of those other files may happen to be some unrelated protobuf (.pb) file. (Imagine someone's download folder). Is there a more unique signature of the .pb file from a TF SavedModel?
We could look for saved_model.pb, which is supposed to be the name of the SavedModel .pb file. We do parse the .pb file later for the tags/signature names, which should further validate the file.
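A sketch of that stricter check, reusing the parse_saved_model import already in cli.py (the helper name is illustrative):

```python
import os
from tensorflow.python.saved_model.loader_impl import parse_saved_model


def looks_like_tf_saved_model(dir_path):
  """Requires the canonical saved_model.pb and a parseable SavedModel proto."""
  if not os.path.isfile(os.path.join(dir_path, 'saved_model.pb')):
    return False
  try:
    # parse_saved_model() also yields the tags/signature names that the
    # wizard later shows as options.
    parse_saved_model(dir_path)
    return True
  except IOError:
    return False
```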
python/tensorflowjs/cli.py, line 117 at r3 (raw file):
Previously, caisq (Shanqing Cai) wrote…
'.hdf5'
The extension name for a HDF5 is often .h5. In fact, it can be any extension name. The HDF5 library doesn't require any specific extensions. I think you should try creating an HDF5 object in order to determine this condition with higher reliability.
Done.
python/tensorflowjs/cli.py, line 122 at r3 (raw file):
Previously, caisq (Shanqing Cai) wrote…
elif (input_path.endswith('.json') and get_tfjs_model_type(input_path) == 'layers-model'):
You can try loading the .json file with Python's json module and check the fields such as generatedBy and convertedBy.
get_tfjs_model_type loads the json file and checks the format field.
python/tensorflowjs/cli.py, line 123 at r3 (raw file):
Previously, caisq (Shanqing Cai) wrote…
detected_input_format = TFJS_LAYERS_MODEL
Should there be an else branch that throws an Error saying that it can't automatically guess the type of the input?
The default value for detected_input_format is None.
python/tensorflowjs/cli.py, line 144 at r3 (raw file):
Previously, caisq (Shanqing Cai) wrote…
v
nit: --> Upper case "V". Same below.
Done.
python/tensorflowjs/cli.py, line 153 at r3 (raw file):
Previously, caisq (Shanqing Cai) wrote…
re.match(TFHUB_VALID_URL_REGEX, path) is None
You can just do if not re.match(TFHUB_VALID_URL_REGEX, path): instead.
Done.
python/tensorflowjs/cli.py, line 167 at r3 (raw file):
nit: Use tuple, instead of list for cases like this.
This was suggested by Daniel for usability; since the CLI expands the path to the file, it is transparent to the user.
python/tensorflowjs/cli.py, line 207 at r3 (raw file):
Previously, caisq (Shanqing Cai) wrote…
return 'The output path already exists: %s' % output_path
In some cases, the user may want to overwrite an existing file (e.g., in the tfjs_layers_model --> keras option). Should that be allowed?
are you suggesting a confirmation to overwrite?
python/tensorflowjs/cli.py, line 220 at r3 (raw file):
Previously, caisq (Shanqing Cai) wrote…
'split_weights_by_layer'
It'll be great if these flag names can be refactored to be shared between cli.py and converter.py to reduce future maintenance cost.
Yes, any suggestions? Export them from the converter file?
python/tensorflowjs/cli.py, line 254 at r3 (raw file):
Previously, caisq (Shanqing Cai) wrote…
key': 'g'
Need a comment to describe what 'key' means.
Done.
python/tensorflowjs/cli.py, line 385 at r3 (raw file):
Previously, caisq (Shanqing Cai) wrote…
c
Upper-case "C" here?
Done.
python/tensorflowjs/cli.py, line 385 at r3 (raw file):
Previously, caisq (Shanqing Cai) wrote…
Weclome
Typo
Done.
python/tensorflowjs/cli.py, line 504 at r3 (raw file):
Previously, caisq (Shanqing Cai) wrote…
print('converter command generated:')
IMO, the generated command should always be printed, no matter whether this is a dry run.
Done.
python/tensorflowjs/converters/converter.py, line 437 at r3 (raw file):
Previously, caisq (Shanqing Cai) wrote…
setup_arguments(arguments=None):
Add a doc string explaining what happens when arguments is provided and what happens when it's not.
Done.
Reviewable status: 0 of 1 approvals obtained (waiting on @caisq and @dsmilkov)
python/tensorflowjs/cli.py, line 158 at r3 (raw file):
Previously, caisq (Shanqing Cai) wrote…
[
nit: Use tuple, instead of list for cases like this.
Done.
Reviewable status: 0 of 1 approvals obtained (waiting on @caisq, @dsmilkov, and @pyu10055)
python/setup.py, line 30 at r4 (raw file):
'tensorflowjs_cli
Naming question: tensorflowjs_cli is confusing in the sense that tensorflowjs_converter is also a CLI. Should we call it tensorflowjs_wizard instead? Similarly, cli.py should perhaps be called wizard.py.
python/tensorflowjs/BUILD, line 54 at r4 (raw file):
tensorflow
s/tensorflow/PyInquirer/
python/tensorflowjs/BUILD, line 132 at r4 (raw file):
":cli"
deps should be sorted lexicographically.
python/tensorflowjs/cli.py, line 207 at r3 (raw file):
Previously, pyu10055 (Ping Yu) wrote…
are you suggesting a confirmation to overwrite?
If the output is a directory (in the case of tfjs_graphs_model, for example), it should just error out.
If the output is a file, yes, I think it should ask the user if they want to overwrite the existing file.
python/tensorflowjs/cli.py, line 220 at r3 (raw file):
Previously, pyu10055 (Ping Yu) wrote…
yes, any suggestions? export them from the converter file?
You can create a new python file, perhaps called cli_common.py, to hold string constants like this that are shared between cli.py (wizard.py?) and converter.py. A new BUILD rule should be created for it too and old BUILD targets should depend on it.
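A sketch of what such a shared module could hold; the values follow the format strings used by tensorflowjs_converter as quoted in this review, so treat the exact set as illustrative:

```python
# cli_common.py -- constants shared by the wizard and converter.py.
KERAS_SAVED_MODEL = 'keras_saved_model'
TF_SAVED_MODEL = 'tf_saved_model'
TF_HUB = 'tf_hub'
TFJS_GRAPH_MODEL = 'tfjs_graph_model'
TFJS_LAYERS_MODEL = 'tfjs_layers_model'

SPLIT_WEIGHTS_BY_LAYER = 'split_weights_by_layer'
```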
python/tensorflowjs/cli.py, line 508 at r3 (raw file):
file
nit: "File(s)"
python/tensorflowjs/cli.py, line 501 at r4 (raw file):
print('tensorflowjs_converter %s' % ' '.join(arguments))
You should print an empty line or two after this, so that screen output is visually easier to parse.
python/tensorflowjs/converters/common.py, line 33 at r4 (raw file):
model formats
style nit: Comments should start with an upper-case letter and end with a period. Same below.
Reviewable status: 0 of 1 approvals obtained (waiting on @caisq and @dsmilkov)
python/setup.py, line 30 at r4 (raw file):
Previously, caisq (Shanqing Cai) wrote…
'tensorflowjs_cli
Naming question: tensorflowjs_cli is confusing in the sense that tensorflowjs_converter is also a CLI. Should we call it tensorflowjs_wizard instead? Similarly, cli.py should perhaps be called wizard.py.
Done.
python/tensorflowjs/BUILD, line 54 at r4 (raw file):
Previously, caisq (Shanqing Cai) wrote…
tensorflow
s/tensorflow/PyInquirer/
Done.
python/tensorflowjs/BUILD, line 132 at r4 (raw file):
Previously, caisq (Shanqing Cai) wrote…
":cli"
deps should be sorted lexicographically.
Done.
python/tensorflowjs/converters/common.py, line 33 at r4 (raw file):
Previously, caisq (Shanqing Cai) wrote…
model formats
style nit: Comments should start with an upper-case letter and end with a period. Same below.
Done.
python/tensorflowjs/cli.py, line 207 at r3 (raw file):
Previously, caisq (Shanqing Cai) wrote…
If the output is a directory (in the case of tfjs_graphs_model, for example), it should just error out.
If the output is a file, yes, I think it should ask the user if they want to overwrite the existing file.
Added a confirmation step if the output directory exists.
python/tensorflowjs/cli.py, line 220 at r3 (raw file):
Previously, caisq (Shanqing Cai) wrote…
You can create a new python file, perhaps called cli_common.py, to hold string constants like this that are shared between cli.py (wizard.py?) and converter.py. A new BUILD rule should be created for it too and old BUILD targets should depend on it.
Done.
python/tensorflowjs/cli.py, line 508 at r3 (raw file):
Previously, caisq (Shanqing Cai) wrote…
file
nit: "File(s)"
Done.
python/tensorflowjs/cli.py, line 501 at r4 (raw file):
Previously, caisq (Shanqing Cai) wrote…
print('tensorflowjs_converter %s' % ' '.join(arguments))
You should print an empty line or two after this, so that screen output is visually easier to parse.
Done.
Reviewable status: 0 of 1 approvals obtained (waiting on @caisq, @dsmilkov, and @pyu10055)
python/tensorflowjs/cli.py, line 63 at r3 (raw file):
Previously, pyu10055 (Ping Yu) wrote…
If we are talking about compression, it is easier for user to visualize 1/2 the size, instead of 8bit.
I'm not convinced that 1/2 and 1/4 are better terminology. Consider the fact that when you use 16-bit quantization, the size of the model is not exactly 1/2, because
- The model is not just weights; there is also topology, which is unaffected by the quantization
- Even if we are just talking about the weights, it's not exactly 1/2 either, because the quantization scheme needs to store the min and the range.
So the 1/2 and 1/4 terminology only creates confusion.
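A rough back-of-the-envelope illustration of that point, with made-up numbers (the scheme stores a min and a scale alongside the quantized weights):

```python
# Hypothetical model: 4 MB of float32 weights plus 200 KB of topology JSON.
topology_bytes = 200 * 1024
float_weight_bytes = 4 * 1024 * 1024
num_weights = float_weight_bytes // 4          # float32 = 4 bytes per weight

# 16-bit quantization: 2 bytes per weight, plus min/scale metadata per group.
quantized_weight_bytes = num_weights * 2
metadata_bytes = 64 * 2 * 4                    # e.g. 64 groups, float32 min + scale
total_before = topology_bytes + float_weight_bytes
total_after = topology_bytes + quantized_weight_bytes + metadata_bytes

print(total_after / total_before)              # ~0.52, not exactly 0.5
```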
python/tensorflowjs/wizard.py, line 133 at r5 (raw file):
what is the TFHub module URL?
Consider printing an example URL here for clarity.
python/tensorflowjs/wizard.py, line 510 at r5 (raw file):
not params['overwrite_output_path']
Should this be param['overwrite_output_path'] is None instead?
Reviewable status: 0 of 1 approvals obtained (waiting on @caisq and @dsmilkov)
python/tensorflowjs/cli.py, line 63 at r3 (raw file):
Previously, caisq (Shanqing Cai) wrote…
I'm not convinced that 1/2 and 1/4 are better terminology. Consider the fact that when you use 16-bit quantization, the size of the model is not exactly 1/2, because
- The model is not just weights; there is also topology, which is unaffected by the quantization
- Even if we are just talking about the weights, it's not exactly 1/2 either, because the quantization scheme needs to store the min and the range.
So the 1/2 and 1/4 terminology only creates confusion.
This method is removed, since each choice can have name and value fields. Here are the labels:
[{
name: 'No compression, no accuracy loss.',
value: 0
}, {
name: '2x compression, medium accuracy loss.',
value: 1
}, {
name: '4x compression, highest accuracy loss.',
value: 2
}]
python/tensorflowjs/wizard.py, line 133 at r5 (raw file):
Previously, caisq (Shanqing Cai) wrote…
what is the TFHub module URL?
Consider printing an example URL here for clarity.
Done.
python/tensorflowjs/wizard.py, line 510 at r5 (raw file):
Previously, caisq (Shanqing Cai) wrote…
not params['overwrite_output_path']
Should this be param['overwrite_output_path'] is None instead?
This field should be set to False or True when the OUTPUT_PATH is set.
@pyu10055 Thanks for addressing my comments so far. It's getting there.
Did you post a screenshot or screencast of the wizard UI somewhere?
Reviewable status: 0 of 1 approvals obtained (waiting on @caisq and @dsmilkov)
Reviewable status: 0 of 1 approvals obtained (waiting on @caisq, @dsmilkov, and @pyu10055)
python/tensorflowjs/wizard.py, line 511 at r6 (raw file):
after
Writing nit: after --> by.
python/tensorflowjs/wizard.py, line 512 at r6 (raw file):
os.scandir
I'm pretty sure that os.scandir is Python 3 only. To the degree we plan to support Python 2 going forward, this should be replaced by a utility function.
python/tensorflowjs/wizard.py, line 513 at r6 (raw file):
entry.stat().st_size
- In addition to printing individual file sizes, it would be nice to print the total file size as well at the end.
- Can you pretty format the output, so that all file names line up at the same column? E.g., see: https://stackoverflow.com/questions/10623727/python-spacing-and-aligning-strings
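A sketch combining the three suggestions (Python 2-compatible os.listdir, aligned columns, and a total), with illustrative formatting:

```python
import os


def print_generated_files(output_path):
  """Lists converted files with right-aligned sizes and a total."""
  total = 0
  for name in sorted(os.listdir(output_path)):
    size = os.path.getsize(os.path.join(output_path, name))
    total += size
    print('%-30s %12d' % (name, size))   # name left-aligned, size right-aligned
  print('%-30s %12d' % ('Total Size (Bytes)', total))
```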
@pyu10055 Another thought (sorry for the fragmented comments. I'll make them more consolidated from here on):
Having two binaries from tensorflowjs is potentially confusing. Can we consolidate tensorflowjs_converter and tensorflowjs_wizard into one? It may still be called tensorflowjs_converter, but when the user calls the binary without any flags, it would just drop into the interactive interface.
WDYT?
Reviewable status: 0 of 1 approvals obtained (waiting on @caisq, @dsmilkov, and @pyu10055)
The issue is that we cannot sync the wizard into g3; having two would allow us to filter out just the wizard.
Reviewable status: 0 of 1 approvals obtained (waiting on @caisq and @dsmilkov)
python/tensorflowjs/wizard.py, line 511 at r6 (raw file):
Previously, caisq (Shanqing Cai) wrote…
after
Writing nit: after --> by.
Done.
python/tensorflowjs/wizard.py, line 512 at r6 (raw file):
Previously, caisq (Shanqing Cai) wrote…
os.scandir
I'm pretty sure that os.scandir is Python 3 only. To the degree we plan to support Python 2 going forward, this should be replaced by a utility function.
Used os.listdir instead.
python/tensorflowjs/wizard.py, line 513 at r6 (raw file):
Previously, caisq (Shanqing Cai) wrote…
entry.stat().st_size
In addition to printing individual file sizes, it would be nice to print the total file size as well at the end.
Can you pretty format the output, so that all file names line up at the same column. E.g., see:
https://stackoverflow.com/questions/10623727/python-spacing-and-aligning-strings
Done.
Check with @nsthorat and @smilkov about whether they have any remaining comments.
Reviewable status: complete! 1 of 1 approvals obtained (waiting on @caisq and @dsmilkov)
Awesome work @pyu10055 ! Thanks for the detailed review @caisq! I left a few comments regarding the outdated info in the readme. LGTM otherwise.
Reviewed 7 of 9 files at r5, 1 of 2 files at r6, 1 of 1 files at r7.
Reviewable status: complete! 2 of 1 approvals obtained (waiting on @caisq, @dsmilkov, and @pyu10055)
README.md, line 28 at r7 (raw file):
existing TensorFlow/Keras installation. __Note__: *Check that [`tf-nightly-2.0-preview`](https://pypi.org/project/tf-nightly-2.0-preview/#files) is available for your platform.*
Since you are already touching README.md, let's remove this sentence, which seems outdated.
README.md, line 30 at r7 (raw file):
__Note__: *Check that [`tf-nightly-2.0-preview`](https://pypi.org/project/tf-nightly-2.0-preview/#files) is available for your platform.* Most of the times, this means that you have to use Python 3.6.8 in your local
This is also outdated; you can use either Python 2 or Python 3 (update the text below accordingly).
README.md, line 73 at r7 (raw file):
There is also dry run mode for the wizard, which will not perform the actual conversion but only generate the command for tensorflowjs_converter command.
wrap tensorflowjs_converter in quotes
python/tensorflowjs/wizard.py, line 1 at r7 (raw file):
# Copyright 2018 Google LLC
2019
Sorry - more comments, on the screen cast at https://drive.google.com/file/d/1mKqfZJ-BnRBTdsTplL6fUBDvBOHDKKWz/view
- In the "files generated by conversion" table printed at the end, the "size" column is not aligned well with the numbers.
- The "size" column should state the unit, e.g., "Size (Bytes)".
- Consider printing an empty line between the successive questions, so that the screen output is easier to parse visually.
Reviewable status: complete! 2 of 1 approvals obtained (waiting on @caisq, @dsmilkov, and @pyu10055)
It is a command-line tool that guides the user through the converter options.
Here is an updated video for the CLI.