-
Notifications
You must be signed in to change notification settings - Fork 214
Move op generation to Java #244
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Not sure what's up with the Javadoc generation failing on the CI. I saw it a few times locally, but a clean fixed it. Update: seeing the same thing when I build on WSL. The artifact builds fine, but it errors with javadoc issues when building the whole project. |
There's also a bunch of ops that start with |
For people working on windows, this can be run (correctly) via WSL. You need to get the api defs from somewhere, but a clone of I set up the exec plugin to allow for this, although since it doesn't use the working dir when running java the command gets a bit long. I'm currently using something like |
Alright, I fixed the build issues (locally, at least), they are only present when including |
Since we don't need javadocs for the generator, can we skip them? Or will that cause issues with central? I don't think it would be published with the way it is configured atm, and we certainly don't need it to be.
|
BTW, TF Core needs MSYS2 to build, it doesn't work with WSL... |
Javadoc isn't just for users, it's for us in 6 months time. I think all new code in this repo should have it whether we expect people to use the class or not (excluding test code). |
Yeah, that's one of the advantages of doing the op gen in Java, it works fine with |
Oh yeah, the code will have it. I just mean building the javadoc jars. |
Right, but if we need WSL, it's going to cause problems for CI. I imagine this can eventually be adjusted to work with MSYS2 as well. Just wanted to point it out for now. |
Well, it turns out it works fine on Windows if I set the api docs dir correctly... The larger issue here is that it depends on |
Looks like something is currently published - https://search.maven.org/artifact/org.tensorflow/tensorflow-core-generator/0.2.0/jar, and so if we still want to publish it then it will need javadoc and sources jars. Does this only happen on Windows? I've not had chance to look through the PR yet, is there an error message somewhere from javadoc I could look at? |
Oh? Weird.
That's not an issue, we just need to run the build with a specific platform like |
Huh, that's the annotation processor but if it's published this will be too. It doesn't happen locally for me, just in the CI (quick build is on linux). I suspect what is happening is that |
There's one or two differences, but I think it's just because I'm building off a cloned tensorflow instead of the downloaded source archive.
I assume that's done by the CI. My issue is more that the op generation is done in |
@Craigacp This may be because the newly generated op javadocs are slightly different, but I'm pretty sure this happens on |
It looks like a lot of the changes are because the |
That's due to the way JavaPoet handles parameter javadocs. I should be able to work around it, but it may not be pretty. |
I have a better solution to the two phase builds. We can still use the native code for op detection, but output the OpDef and ApiDef protos to the Java generator to do the generation. I'd like to it via pipe but I need to look into protobuf a little more. The only thing is that we need to include the proto defs in the generator. Since they are auto-generated anyways I may just be able to add the dir as a source file, otherwise we can either copy them or extract them to another module. Any preferences there? |
We could potentially just bundle them in the JAR file. It's really easy to
bundle and extract things like that as "native resources" with JavaCPP.
Check what it looks like for files from CPython here:
https://github.com/bytedeco/javacpp-presets/blob/master/cpython/src/main/java/org/bytedeco/cpython/presets/python.java#L192
https://github.com/bytedeco/javacpp-presets/blob/master/cpython/src/main/java/org/bytedeco/cpython/presets/python.java#L213
|
Thanks @rnett , the size of that PR is quite impressive and would take time to review. I remember in the days I've written the C++ version, my Google reviewer asked me to break it down into smaller pieces and I think I understand why ;) Any chance you can do the same? On a different note, here's just a few thoughts:
|
Yeah, I can leave the actual generation for another PR, and just set it up so that we could use it in this one. I'll leave it in for now so you can see what the generated code looks like, and pull it out sometime in the next few days. Depending on how you do the review, I've tried (and I think been successful at) keeping the generated code in separate commits, so you can always just drop them locally.
It should be. The reason I didn't is because I was using
It's mostly just maintenance. I know @JimClarke5 mentioned writing a better markdown converter. The original reason I did it is I was running into segfaults when trying to add function generation, but it turns out that's because I was running the wrong thing. I also want to try some more complicated stuff with functions. What exactly that looks like depends on what we do with the function API, but ideally I'd like to be able to pass lambdas to ops that take functions. There's also capture handling, where ops like
I'll update this PR as I do this, but my current approach is doing something like @saudet mentioned, and creating what is essentially our own I need access to the protos either way, do you have any preference between:
|
I cannot figure why we would prefer to build our own
Ideally my preference would be 5. :) which is to run the Bazel build at the In other words, I'm dreaming since the first day of a file structure like this:
When invoking the Note that we can still do all of the above while keeping the actual file structure, that would work too, but I always got that feeling that the native libraries/files resulting from the Bazel build should be accessible to more than one module ( |
All files are available to all other modules with the approach I'm talking about above with CPython. What's wrong with packaging those development files in JARs? |
I'm not saying it's wrong but I would prefer avoiding having an additional JAR/module like The |
A couple things:
If you have a solution for providing the native library in dev mode this changes, but my thinking was if we need to produce an ApiDef map using the native library anyways, we might as well just add the op defs rather than trying to get bazel to fetch I was thinking we'd probably end up with two generator modules. One for native builds that doesn't depend on |
<<END is called a here document and a similar syntax is also used in Unix Shell. Also, I have found an open source Java project to parse Markdown similar to the python code I was using. commonmark-java . This should make it easier to covert the Markdown in the apidef to JavaDoc. |
It doesn't need to be a plugin, but in any case, we can already call TF_GetAllOpList() from Java, yes, that's not a problem:
We don't need JavaCPP, but it's 1 line to copy all resources at build time, and 1 line to load them in the cache. There are other libraries that do that too, it can be (re)done manually too, sure, but what's the point.
It doesn't need to be in the same JAR file. We can have a "dev" JAR where we just dump everything that wouldn't be needed by end users, just like Linux distributions do it. That way, the JAR file can also be downloaded with Maven automatically by developers that don't need to rebuild TF Core without having to deal with Bazel at all. With Maven profiles, the same artifact can have different dependencies, that's not a problem either. |
@rnett ok here's another idea then, why not bundling the Full build:
Dev build:
This should be quite simple to achieve and is very close that we are already doing, without the need of reshuffling the modules at all. It would have been interesting too to be able to call the JNI code that is already part of the |
…lt text. Add missing description and @return for Getters/Setters Add fixes for missing @param's and Generic Types Reformat code. Signed-off-by: Ryan Nett <[email protected]>
Fix handling of oddities from Python, like illegal html tag <bytes>
Signed-off-by: Ryan Nett <[email protected]>
Fix oddities from TF Python like @compatibility, @end_compatibility, @tf.xxxxx Signed-off-by: Ryan Nett <[email protected]>
Signed-off-by: Ryan Nett <[email protected]>
Signed-off-by: Ryan Nett <[email protected]>
Signed-off-by: Ryan Nett <[email protected]>
Signed-off-by: Ryan Nett <[email protected]>
Signed-off-by: Ryan Nett <[email protected]>
I made a separate PR with all the generated op changes: rnett#3 |
The memory leak test is failing @karllessard @saudet, it's failing on |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good job @rnett and @JimClarke5 ! We can merge that one now (unless @rnett you want to add that comment before) but before enabling this Java generator, I would like to configure the lint checks (as I was suppose to do in the last two years...) to double-check that the code generated with it is still compliant (like the actual one generated by the C++ generator is).
Which platform? |
That's on the latest ubuntu container in GitHub Actions, check config for the quick-build |
Seems a bit flaky on all platforms actually yes. I'll try to make it more reliable... |
@karllessard yeah I'll add the comment. There's a slight issue wrt running the generator during the |
Also, what do you think about only automatically running the generator when |
Signed-off-by: Ryan Nett <[email protected]>
Oops, sorry, I’ve noticed your comments just after merging! But that’s fine, it’s the normal way Maven works with interdependencies inside the same project: modules added as dependencies to others needs to be build/installed first and that’s what we do.
I will also leave it like that, it might be counterintuitive that you need to run it manually when you’re in dev mode. Is there any value for not running it? |
Only speed, really, and not having to worry about any nondeterminism or extra changes (like with the annotation processor). It's pretty fast, but I'm not sure if it will stay that way if we have to format it. Probably not worth pulling it out though. |
yeah so let’s be reactive and only readjust if needed 👍 |
This is the start of moving the op class generators to Java. It works fine on Linux, which is where I'm assuming the old op generator is normally run.
However, there are some output differences on Windows:
BroadcastArgs
with the class nameBroadcastDynamicShape
, andAll
asReduceAll
. The class name comes from theApiDef
endpoints on both generators, I am unsure why they differ.I don't think there's really anything we can do about this at the moment, it's just a bit odd to have a java program that you can't run on all platforms.