Skip to content

Refactor modules and integrate JavaCPP to map the C API #1

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Oct 4, 2019

Conversation

saudet
Copy link
Contributor

@saudet saudet commented Sep 26, 2019

This basically integrates JavaCPP in the way discussed previously at karllessard#2, in addition to building with MKL enabled and upgrading to TF 2.0.0-rc2. (Maybe src/gen/java is too much? It should change only little by little from now on though.)

I also took the liberty to refactor tentatively the modules a bit, especially the naming. The module that launches Bazel is "tensorflow-core-api", whose parent is "tensorflow-core", so we don't end up with something odd like "parent-core", which also doesn't match with the subdirectory name. Other modules in "tensorflow-core" also use it as prefix, "tensorflow-core-platform" and "tensorflow-core-examples", which contains the LabelImage example that I moved there and a HelloWorld one for the C API. I've also moved the single source file from "annotation-processor" to "tensorflow-core-api" since we can execute all that in the same module. What do you think about that? It's not supposed to be used for anything else, right?

This naming scheme also works for others: "tensorflow-utils" -> "tensorflow-utils-nio", "tensorflow-frameworks" -> "tensorflow-frameworks-keras", but we can still deviate a bit and do something like "tensorflow-keras". (BTW, I think we should use a different name than "keras", like "kerasj", for example.) If people follow even approximately these rules, I think we should get a pretty nice and consistent repository such as the one for Spring Boot: https://github.com/spring-projects/spring-boot

The project builds and runs fine on Linux and Mac here, but I still get a weird compiler error on Windows. I'll eventually get it working, but other than that I think it's ready for a review! (BTW, I had to copy .bazelrc in there. Is there a better way to achieve this?)

@googlebot
Copy link

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed (or fixed any issues), please reply here with @googlebot I signed it! and we'll verify it.


What to do if you already signed the CLA

Individual signers
Corporate signers

ℹ️ Googlers: Go here for more info.

@saudet
Copy link
Contributor Author

saudet commented Sep 26, 2019

@googlebot I signed it!

@googlebot
Copy link

CLAs look good, thanks!

ℹ️ Googlers: Go here for more info.

@karllessard
Copy link
Collaborator

karllessard commented Sep 28, 2019

I also took the liberty to refactor tentatively the modules a bit, especially the naming. The module that launches Bazel is "tensorflow-core-api", whose parent is "tensorflow-core", so we don't end up with something odd like "parent-core", which also doesn't match with the subdirectory name. Other modules in "tensorflow-core" also use it as prefix, "tensorflow-core-platform" and "tensorflow-core-examples", which contains the LabelImage example that I moved there and a HelloWorld one for the C API.

I like those new names as well. But I would remove completely the tensorflow-core-examples folder, I was planning to move those samples to our second repo (github.com/tensorflow/java-models).

I've also moved the single source file from "annotation-processor" to "tensorflow-core-api" since we can execute all that in the same module. What do you think about that? It's not supposed to be used for anything else, right?

Like we discussed on the call, I'm pretty sure the annotation processor needs to be in its own artifact because we should depend on it when building the tensorflow-core-api jar. See how it is done here and here. Let me know you find another way to make it work, but I suggest that we run that processor outside Bazel even in that first phase.

(BTW, I had to copy .bazelrc in there. Is there a better way to achieve this?)

Looks like TensorFlow Serving is doing the same so I guess it is OK?

fi

# Build TensorFlow itself
bazel build --python_path="$PYTHON_BIN_PATH" --config=mkl --output_filter=DONT_MATCH_ANYTHING --verbose_failures @org_tensorflow//tensorflow:tensorflow @org_tensorflow//tensorflow/java:tensorflow
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we build only targets that we actually need for the Java part? We don't have to build the whole client, afaik we just need the operation wrappers, which are provided by this target //tensorflow/java:java_op_gen_sources, in replacement to //tensorflow/java:tensorflow.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, we can't, unfortunately, that is, not without patching TensorFlow, those targets are not public. Though, I've had to add a patch for MKL, since the fix isn't in 2.0.0. We can add more patches if we want to go that way.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think we can have @sjamesr commit this very quickly in the main repository, I'll check with him but not super important neither, we don't have to wait for this.

<artifactId>javapoet</artifactId>
<version>1.11.1</version>
<optional>true</optional> <!-- for compilation only -->
</dependency>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

javapoet and guava are just required for the annotation-processor, which should be move to its own artifact (unless you find a way to make it work where you put it).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK, it's working just fine, yes. If we uncomment the lines that start with <!-- Bazel currently runs this, it gets executed just fine. However, it outputs an error because it doesn't want to overwrite existing classes:

[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.8.0:compile (default-compile) on project tensorflow-core-api: Fatal error compiling: java.lang.AssertionError: javax.annotation.processing.FilerException: Attempt to recreate a file for type org.tensorflow.op.NnOps -> [Help 1]

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's because you copied those files (NnOps & friends) in your bash script previously, if you remove that last line in the script I pointed out, it should be fine... so you are saying that the annotation-processor runs successfully directly from the core-api jar?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's correct. If I delete src/gen/java, it ends up generating them in target/generated-sources/annotations/. Is there a way to change that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, done! But I haven't been able to get it to output to src/gen/java. I tried to hack it a dozen ways, but maven-compiler-plugin is hard coded to not compile anything from the generatedSourcesDirectory directory, which makes sense I guess, but it would be nice if we could use the includes and excludes filters instead. Anyway, it's now outputting to src/gen/annotations. Maybe we can do the same with the others like src/gen/javacpp and src/gen/ops...

public void delete() {
deallocate();
}
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this class (and all other classes found in src/main/java/org/tensorflow/c_api) actually used by your PR? If not, I would prefer we wait for the second phase before adding them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, they are the base classes of TF_Buffer, etc. They are functional, and we can modify them later as well. What are you worried about?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just don't find any additional value for having classes in the code base that are actually not used, it creates unnecessary confusion and they can be easily reviewed and added later as part of the whole migration of the JNI code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But they are used. Why do you say that they are not?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, sorry I didn’t read carefully that last reply of yours, all good then.

So generated classes can extend from human-written ones? Just for my knowledge, where this magic is happening?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

<configuration>
<skip>${javacpp.parser.skip}</skip>
<outputDirectory>${project.basedir}/src/gen/java</outputDirectory>
<classOrPackageName>org.tensorflow.c_api.presets.*</classOrPackageName>
Copy link
Collaborator

@karllessard karllessard Sep 28, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The TF Java code is based upon the Google Java Style Guide, which states that package names shouldn't have underscores in it...

Can we rename the c_api package to something else, like nativeapi?

(BTW would be great that generated class names also stick to camelcase, with no underscores, but I guess that could be a challenge to change this, right? e.g TFEContext instead of TFE_Context, or DeallocatorPointerLongPointer instead of Deallocator_Pointer_long_Pointer)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, we can start customizing names like that yes, but the idea with JavaCPP is to map the underlying API to something that's as close as possible to the native API. If we start renaming things randomly like that, it increases the chances of conflicts and it makes it harder to understand the mapping. Besides, the C API is very hard to use, it's not meant to be used by end users, and making it even harder to use isn't going to help. :)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the classes won't be public, I'm ok to just rename the package as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My thought was that having a package name like c_api makes it less likely for the casual to want to use it. I also thought about capi, but I just don't find it as obvious as c_api, which is used by TensorFlow in the docs making it clearer that it's related: https://www.tensorflow.org/install/lang_c

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok then

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe let's get the opinion of at least one other person? In a lot of places where JavaCPP just uses _, I know Panama uses $ because, you know, $ isn't a valid character in C/C++ identifiers, but it is in Java!! (I wonder how they are going to get this passed the CSR group...) @Craigacp Any opinions about this?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think for internal package names which we won't expose to the user then _ is fine. Panama uses $ for internally generated things because it guarantees no conflicts, but in this case the header file is actually called c_api.h, so mirroring the name is fine. Panama's jextract gives it the same name (see here https://hg.openjdk.java.net/panama/dev/raw-file/c359a9e944de/doc/panama_foreign.html#using-tensorflow-c-api-mac-os).

@karllessard
Copy link
Collaborator

(Maybe src/gen/java is too much? It should change only little by little from now on though.)

Yeah, I'm still not sure what we should do with those generated files neither... not having them committed though is pretty annoying when you want to write unit tests that uses them, for example... let's keep it there and agree that we might change our mind later.

@karllessard
Copy link
Collaborator

karllessard commented Sep 28, 2019

An idea for the annotation processor artifact: when we'll migrate the Java operators generator from C++ to Java, we will probably move the code in the same artifact as this processor. So we can give it a more generic name, like tensorflow-op-processor or anything better you may find

@saudet
Copy link
Contributor Author

saudet commented Sep 29, 2019

An idea for the annotation processor artifact: when we'll migrate the Java operators generator from C++ to Java, we will probably move the code in the same artifact as this processor. So we can give it a more generic name, like tensorflow-op-processor or anything better you may find

Good point. We can make the name even more generic too, like tensorflow-core-generator to stuff there anything we don't need at runtime.

@SidneyLann
Copy link

The java TF 2.0 available now? have more than 90% functionalities of TF 2.0?

@karllessard
Copy link
Collaborator

Good job @saudet , we are almost good to go, just want to follow up with you about my previous suggestion to remove completely tensorflow-core-examples from this repo.

I think it could be nice to have all the sample code in the java-models repo instead, to replicate what the core team is doing (sample code is found in /tensorflow/models instead of /tensorflow/tensorflow).

Otherwise, everything looks fine to me

@karllessard
Copy link
Collaborator

The java TF 2.0 available now? have more than 90% functionalities of TF 2.0?

@SidneyLann, I think the closest TF Java is to TF 2.0 is that it supports eager execution. Work is currently in progress to add a Keras-like API on top of it and and to handle functional graphs.

@SidneyLann
Copy link

The java TF 2.0 available now? have more than 90% functionalities of TF 2.0?

@SidneyLann, I think the closest TF Java is to TF 2.0 is that it supports eager execution. Work is currently in progress to add a Keras-like API on top of it and and to handle functional graphs.

https://github.com/tensorflow/swift/blob/master/docs/WhySwiftForTensorFlow.md, this link said java can't be used for Graph Program Extraction because java can't do static analysis, need to use swift, can you talk about this problem?

@Craigacp
Copy link
Collaborator

Craigacp commented Oct 2, 2019

@SidneyLann Those are different issues. Functional graphs as implemented in TF2 in python are different from extracting a graph program directly from the Swift program. As I understand it the Swift one is more precise, whereas the Python one requires tracing the annotated functions to determine the currently executing codepath. It's possible to do the tracing in Java, that's kinda how eager mode works for backprop.

Java's type system isn't quite powerful enough to express the kind of generics you'd like to use for static compile time typing of tensors (including their shapes), but to get a type system that powerful requires lots of other features which make it harder to use and can lead to a lot of complexity. I'm not sure if Swift's is powerful enough for that, but the Swift for Tensorflow project encompasses changes to the Swift language and core libraries, which allows you to do more than is possible in a normal library.

Also, this is pretty off topic for this pull request, if you want to talk about TF on the JVM you can post to the Tensorflow JVM SIG Google Group, or go to the Gitter (https://gitter.im/tensorflow/sig-jvm).

@saudet
Copy link
Contributor Author

saudet commented Oct 2, 2019

Good job @saudet , we are almost good to go, just want to follow up with you about my previous suggestion to remove completely tensorflow-core-examples from this repo.

I think it could be nice to have all the sample code in the java-models repo instead, to replicate what the core team is doing (sample code is found in /tensorflow/models instead of /tensorflow/tensorflow).

Otherwise, everything looks fine to me

Ah, I didn't realize you wanted to remove everything even my little "HelloWorld" :) I was thinking we could leave snippets and maybe even some documentation there for "core developers". Do you see a better place for those? They wouldn't go in another repo... Maybe the wiki? It doesn't appear to be enabled on TensorFlow repos though.

@karllessard
Copy link
Collaborator

@saudet, I am discussing this with @tzolov, we’ll definitely have a page for developers, need to figure out where, but let’s keep that topic pending for now and remove this example artifact from the current PR, I don’t think your “Hello World” will be missed anyway :D

But if you feel sentimental about it, maybe you could move it as an integration test in the src/test folder of the core-api until we figure this out?

@karllessard
Copy link
Collaborator

... at the same time, it’s up to you, we can keep it there and move it later as well, let me know.

@saudet
Copy link
Contributor Author

saudet commented Oct 3, 2019

Yes, unit tests as documentation, that works too. Ok, it's done!

@saudet saudet changed the title [WIP] Refactor modules and integrate JavaCPP to map the C API Refactor modules and integrate JavaCPP to map the C API Oct 3, 2019
@karllessard karllessard merged commit 9eb0357 into tensorflow:master Oct 4, 2019
karllessard added a commit that referenced this pull request Apr 28, 2020
karllessard pushed a commit that referenced this pull request Oct 8, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants