Init Scope #338
This change is going to break Tribuo in ways that are hard to fix, because it removes the init op we use to find initializers in serialized GraphDefs, and there doesn't seem to be a way to get the initialization op name back out. I think you should consider how this interacts with serialized GraphDefs, which are the only way to persist a graph structure that we've got at the moment. The current init mechanism mirrors what TF Python does in terms of what gets persisted into a GraphDef.
You can see how Tribuo currently uses the init ops here - https://github.com/oracle/tribuo/blob/main/Interop/Tensorflow/src/main/java/org/tribuo/interop/tensorflow/TensorFlowTrainer.java#L488 through line 514.
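To make the concern concrete, here is a minimal sketch of name-based initializer lookup. It is a toy model, not Tribuo's or TensorFlow Java's code: the serialized graph is reduced to a list of op names, and `INIT_OP_NAME` and `findInitOp` are hypothetical names chosen for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

final class InitLookupSketch {
    // Hypothetical well-known name under which the init op is stored (an assumption).
    static final String INIT_OP_NAME = "init";

    // Models a serialized graph as nothing more than its list of op names
    // and looks the initializer up by that well-known name.
    static Optional<String> findInitOp(List<String> serializedOpNames) {
        return serializedOpNames.stream().filter(INIT_OP_NAME::equals).findFirst();
    }

    public static void main(String[] args) {
        // A graph that was saved with a named init op.
        List<String> withInit = Arrays.asList("weights", "init", "train_step");
        // The same graph once the named init op is no longer emitted.
        List<String> withoutInit = Arrays.asList("weights", "train_step");

        System.out.println(findInitOp(withInit).isPresent());    // true
        System.out.println(findInitOp(withoutInit).isPresent()); // false
    }
}
```

If the only thing that survives serialization is the op name, a loader written this way has nothing left to search for once the named op disappears.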
The exporting and importing is a problem, can you link where Tribuo does the exporting? The intent was that when importing Python models, you would find the Restore op and use |
Looking more into I also made |
Tribuo uses |
Ok, GraphDef import/export is now supported, see https://github.com/rnett/java/blob/rn_init_scope/tensorflow-core/tensorflow-core-api/src/test/java/org/tensorflow/GraphTest.java#L68
@JimClarke5 I've only done enough updates to |
@rnett, can you rebase this PR or fix the conflicts please?
Signed-off-by: Ryan Nett <[email protected]>
Ok, done.
Snapshot builds appear to be broken: the mac artifact is missing from https://oss.sonatype.org/content/repositories/snapshots/org/tensorflow/tensorflow-core-api/0.4.0-SNAPSHOT/, and the CI is failing because of that. cc @karllessard
@rnett I am trying to build on macOS, pulling
Any suggestions?
@rnett does the Also can you re-init the variable later? I am looking at Metrics For example: Upon original init: I assume, this will cause variable Later on I want to reset I desire to have the |
@JimClarke5 It should, although I'm not sure we want to purposefully introduce that kind of eager-like graph modification to session semantics. IMO it would be odd to intentionally have a Java function that's not a session run affect the session state; the "run all new initializers" behavior was intended more as a failsafe. You could also run into thread-safety issues when modifying a graph while the session is open. Ideally, I think the solution would be to have things like Metrics use Eager variables to store their state, but that makes managing them more complicated.
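For readers following along, the "run all new initializers" failsafe can be pictured with a toy model like the one below. This is only a sketch under assumptions: `SketchGraph` and `SketchSession` are hypothetical stand-ins rather than the `org.tensorflow` classes, and ops are reduced to strings.

```java
import java.util.ArrayList;
import java.util.List;

final class NewInitializersSketch {
    public static void main(String[] args) {
        SketchGraph graph = new SketchGraph();
        graph.addInitializer("init a");

        SketchSession session = new SketchSession(graph);
        session.run("step 1");

        // An initializer registered while the session is already open.
        graph.addInitializer("init b");
        session.run("step 2");

        System.out.println(session.executed); // [init a, step 1, init b, step 2]
    }
}

// Hypothetical graph that only records its initializers, in registration order.
final class SketchGraph {
    private final List<String> initializers = new ArrayList<>();

    synchronized void addInitializer(String name) {
        initializers.add(name);
    }

    // Returns a copy of the initializers registered at index 'start' or later.
    synchronized List<String> initializersFrom(int start) {
        return new ArrayList<>(initializers.subList(start, initializers.size()));
    }
}

// Hypothetical session that remembers how many initializers it has already run.
final class SketchSession {
    private final SketchGraph graph;
    private int initializersRun = 0;
    final List<String> executed = new ArrayList<>();

    SketchSession(SketchGraph graph) {
        this.graph = graph;
    }

    synchronized void run(String target) {
        // Failsafe: run any initializer registered since the previous run.
        List<String> pending = graph.initializersFrom(initializersRun);
        executed.addAll(pending);
        initializersRun += pending.size();
        executed.add(target);
    }
}
```

The locks here only keep the toy lists consistent; they say nothing about whether mutating a real graph under an open session is safe, which is the concern raised above.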
I have totally refactored I have removed the Here is the main interface for
Once this PR is merged I will create another PR for these metrics' changes. My next goal is to rework |
I have finished converting Except for these failures in the overall build, I am good to go
I do notice that the exact values are different from when this error first appeared for me. |
Can you rebase on this branch? The latest commit here fixed those for me.
@rnett I have pulled your latest code and all is good now. I am ok with this PR.
LGTM, @Craigacp any additional comment before I merge this?
Apart from the NoOp issue that I just mentioned it all looks fine. I am worried about the level of change this is to the codebase and I can't quite see the reason for it, but that doesn't mean there isn't a good one.
@Craigacp It's mostly to support functions, which are Graphs with Eager init scopes. That wasn't possible to implement using fake graph ops like the ones we were using for init previously.
According to @JimClarke5's comment, the improvements in the framework are substantial. Personally, I think if we can lift the burden of initialization from the users, it's a plus, but saved models were already handling this for inference, so the changes only affect users who were doing their training in Java (@rnett, am I right here?)
Yeah, the SavedModel handles it for TF2 models. This also makes it a bit easier to handle TF1 models, and we can export our initializers more easily (I'm going to make a new PR for those).
This PR will be beneficial whenever a model is created by the developer, which includes all the major model actions |
Ok, I think I'm fine with this being merged.
I do wonder if we could get away with just tracking the init ops better as we own all of them without creating a privileged init scope that people could use incorrectly (and isn't type safe), but I don't understand the eager/graph function use case well enough to know if that would cause problems.
Thanks @rnett!
Reworks how initialization is handled (somewhat, mostly the API). I was originally planning on doing name-based initialization for interop with TensorFlow, but they don't do name-based either, so I kept the current method.
The API has changed: instead of adding init ops yourself, they will automatically be added when created with an init scope, which you get using Ops.initScope() or Scope.initScope(). This allowed for better error checking (i.e. init ops can't depend on non-init ops) and eliding control dependencies on them (since they are created at init time), but required making Scopes part of OpBuilders (which was a good thing mostly, and prevents future "forgot to call apply" bugs). I also added currently unused methods to do initialization in a different execution environment, which will be used by functions.
So for an example, creating a variable w/ an initial value becomes tf.initScope().variable(tf.initScope().constant(4f)), which automatically registers it for initialization by sessions.
I also added helpers to Session to create and initialize, and a requirement that if the graph has init ops, initialization must be run before running anything else. I don't really like having separate initialized factory methods. Thoughts on having it be initialized on construction by default, but having Session(Graph, boolean) constructors where you can control it?
I have yet to update framework to use this. Does it work for what you were working on w/ variables @JimClarke5?
The new generated op changes also aren't committed yet, for size reasons.
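As a rough illustration of the behavior described in this PR, here is a self-contained toy model. It does not use the real API: `ToyOp`, `ToyGraph`, `ToyScope` and `ToySession` are hypothetical stand-ins, and only the ideas (automatic registration through an init scope, rejecting init ops that depend on non-init ops, and requiring initialization before any other run) come from the description above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class InitScopeSketch {
    public static void main(String[] args) {
        ToyGraph graph = new ToyGraph();
        ToyScope tf = new ToyScope(graph);

        // Ops created through the init scope are registered automatically.
        ToyOp initial = tf.initScope().op("constant");
        ToyOp variable = tf.initScope().op("variable", initial);
        ToyOp add = tf.op("add", variable);

        ToySession session = new ToySession(graph);
        session.initialize(); // must happen before any other run
        session.run(add);
        System.out.println(session.log); // [init:constant, init:variable, run:add]
    }
}

// Hypothetical op: a name, an init flag and its inputs.
final class ToyOp {
    final String name;
    final boolean isInit;
    final List<ToyOp> inputs;

    ToyOp(String name, boolean isInit, List<ToyOp> inputs) {
        this.name = name;
        this.isInit = isInit;
        this.inputs = inputs;
    }
}

// Hypothetical graph that keeps a separate list of init ops.
final class ToyGraph {
    final List<ToyOp> ops = new ArrayList<>();
    final List<ToyOp> initOps = new ArrayList<>();
}

// Hypothetical scope; initScope() returns a scope that marks new ops as init ops.
final class ToyScope {
    private final ToyGraph graph;
    private final boolean init;

    ToyScope(ToyGraph graph) {
        this(graph, false);
    }

    private ToyScope(ToyGraph graph, boolean init) {
        this.graph = graph;
        this.init = init;
    }

    ToyScope initScope() {
        return new ToyScope(graph, true);
    }

    ToyOp op(String name, ToyOp... inputs) {
        if (init) {
            // Error checking: an init op may only depend on other init ops.
            for (ToyOp input : inputs) {
                if (!input.isInit) {
                    throw new IllegalArgumentException(
                        "init op '" + name + "' depends on non-init op '" + input.name + "'");
                }
            }
        }
        ToyOp op = new ToyOp(name, init, new ArrayList<>(Arrays.asList(inputs)));
        graph.ops.add(op);
        if (init) {
            graph.initOps.add(op); // no manual "add init op" call needed
        }
        return op;
    }
}

// Hypothetical session that refuses to run until init ops have been run.
final class ToySession {
    private final ToyGraph graph;
    private boolean initialized = false;
    final List<String> log = new ArrayList<>();

    ToySession(ToyGraph graph) {
        this.graph = graph;
    }

    void initialize() {
        for (ToyOp op : graph.initOps) {
            log.add("init:" + op.name);
        }
        initialized = true;
    }

    void run(ToyOp target) {
        if (!graph.initOps.isEmpty() && !initialized) {
            throw new IllegalStateException("graph has init ops; call initialize() first");
        }
        log.add("run:" + target.name);
    }
}
```

In this toy, calling `run` before `initialize` throws, and creating an init op from a non-init input is rejected at construction time, which mirrors the two checks the description mentions.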