-
Notifications
You must be signed in to change notification settings - Fork 214
Upgrade for TensorFlow 2.4.1 #212
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Thanks a lot @saudet , let's prioritize this merge and leave 2.3.1 broken for now. I'll pull your branch and see what I can do to fix the string issues + classify the new ops |
Finally, while I'm building your branch, let's give it a try on 2.3.1... 8b36d7e |
|
||
// #include "tensorflow/core/platform/ctstring_internal.h" | ||
|
||
// Initialize a new tstring. This must be called before using any function |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any clue why the documentation of these new string endpoints get all messed up by being aggregated together?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The TF_TString type isn't defined in ctstring.h, so we need to include ctstring_internal.h as well, which contains the definitions of those functions too. JavaCPP simply skips the redundant declarations that follow. We can hack something up, but this is something they should fix upstream, and eventually will I assume, so probably not something worth spending time on here.
if ("size" == output.var().name()) { | ||
// reserved name | ||
continue; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should not skip the getter of an output that have a reserved name but just rename it in the API def, though I'm not sure why size
would cause any problem here. Can you give more context to this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are new ops with outputs named "size". Do whatever you want with them, I just did that to get them to compile.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I'll take care of it
Everything builds and all tests pass, so I believe this is ready to merge, unless we should refine all changes to the ops here as well? Also, the GPU builds on Linux are timing out: |
I'm actually working on classifying the new ops, which should be completed before merging. I'll make a PR to your branch probably today. I'm almost done, I just hit a backward-compatibility issue with some of them which I'll duplicate for now but that's something I'd like to bring as a topic for our next session to decide how to handle this properly next time. |
All (the pull request submitter and all commit authors) CLAs are signed, but one or more commits were authored or co-authored by someone other than the pull request submitter. We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that by leaving a comment that contains only Note to project maintainer: There may be cases where the author cannot leave a comment, or the comment is not properly detected as consent. In those cases, you can manually confirm consent of the commit author(s), and set the ℹ️ Googlers: Go here for more info. |
Just pushed the changes, let's see how it goes with the build: https://github.com/tensorflow/java/actions/runs/589500056 |
@googlebot I consent |
This draft PR includes initial changes required just to get the builds going with TF 2.4.1. Some tests are crashing, probably because we still need to update the handling of
TF_String
as per the breaking change below, but there's probably a bit more to be done.https://github.com/tensorflow/tensorflow/releases/tag/v2.4.0
It's quite a big release, and a lot of ops have been added, so even after getting everything fixed up, there's probably more to be done before we can get this merged. @karllessard Please take a look and feel free to push any changes to my fork and/or merge this into a branch of the main repository.
On the flip side, most toolchain workarounds have been moved from
configure.py
to.bazelrc
(so we can clean upbuild.sh
a little), we finally have support for CUDA 11 and cuDNN 8, MKL-DNN 1.x now works on all platforms (Linux, Mac, and Windows) without huge external binaries, and I was able to find what the issue was with CUDA on CentOS. Apparently, it doesn't like its new version of glibc, and we can prevent it from getting installed by callingyum
this way:https://github.com/saudet/tensorflow-java/blob/upgrade-tensorflow-241/.github/workflows/ci.yml#L53-L58
@karllessard Feel free to merge this bit right away for TF 2.3.1!