dev_container #2613

Closed
fsx950223 opened this issue Nov 25, 2021 · 11 comments
Labels
blocked Pending something elses completion

@fsx950223
Member

fsx950223 commented Nov 25, 2021

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04):
  • TensorFlow version and how it was installed (source or binary):
  • TensorFlow-Addons version and how it was installed (source or binary):
  • Python version:
  • Is GPU used? (yes/no): no

Describe the bug

Why is tfaddons/dev_container:latest-cpu so big (6.3 GB)? It has some CUDA layers and lots of apt-update layers, which increase the image size.
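
One common way to avoid extra apt-update layers like the ones described above is to run the update, install, and cleanup in a single RUN instruction. A minimal sketch, assuming a hypothetical Ubuntu base image and package list (neither is taken from the actual Addons Dockerfile):

```dockerfile
# Hypothetical base image, chosen only for illustration.
FROM ubuntu:20.04

# Each RUN creates one layer. Updating, installing, and removing the
# package index in the same RUN keeps the index out of the final image.
RUN apt-get update \
    && apt-get install -y --no-install-recommends git curl \
    && rm -rf /var/lib/apt/lists/*
```

Deleting /var/lib/apt/lists in a later RUN would not shrink the image, because the earlier layer would still contain the files.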

Code to reproduce the issue

Create a codespace.

Other info / logs

https://hub.docker.com/layers/tfaddons/dev_container/latest-cpu/images/sha256-e97c0a51c9da13134b9e4f2a27aeee662def8e77ced84224f4dcd90e00cc18d3?context=explore

2021-11-25T06:01:15: [2021-11-25T06:01:15.292Z] failed to register layer: ApplyLayer exit status 1 stdout:  stderr: write /usr/local/lib/python3.9/site-packages/tensorflow/python/_pywrap_tensorflow_internal.so: no space left on device
Error: Command failed: docker build -f /var/lib/docker/codespacemount/workspace/addons/.devcontainer/Dockerfile -t vsc-addons-7dc239d633fc90a0907165f6f5d2c6fb /var/lib/docker/codespacemount/workspace/addons/.devcontainer
    at A7 (/usr/lib/node_modules/@microsoft/vscode-dev-containers-cli/dist/node/devContainersCLI.js:318:7786)
    at async T7 (/usr/lib/node_modules/@microsoft/vscode-dev-containers-cli/dist/node/devContainersCLI.js:318:6090)
    at async pF (/usr/lib/node_modules/@microsoft/vscode-dev-containers-cli/dist/node/devContainersCLI.js:318:2407)
    at async o7 (/usr/lib/node_modules/@microsoft/vscode-dev-containers-cli/dist/node/devContainersCLI.js:312:10911)
    at async n3 (/usr/lib/node_modules/@microsoft/vscode-dev-containers-cli/dist/node/devContainersCLI.js:344:3255)
    at async dae (/usr/lib/node_modules/@microsoft/vscode-dev-containers-cli/dist/node/devContainersCLI.js:344:22780)
    at async fae (/usr/lib/node_modules/@microsoft/vscode-dev-containers-cli/dist/node/devContainersCLI.js:344:22376)


@bhack
Contributor

bhack commented Nov 25, 2021

Yes, with free/eval Codespaces the disk space is limited, but the main issues are:

#2598 (comment)

And
#2515

https://discuss.tensorflow.org/t/adopting-open-source-dockerfiles-for-official-tf-nightly-ci/6050/4

/cc @seanpmorgan

@fsx950223
Member Author

IMO, the Codespaces container should use a different image instead of the DevOps image.

@fsx950223
Member Author

Why do we need multipython in Codespaces?

@bhack
Contributor

bhack commented Nov 25, 2021

The point is to have the same developer container as the one we use in CI, so that we are almost on the same page when we develop TF Addons and when we automatically validate it with CI, without too much risk of the two environments getting out of sync. This type of drift seems to happen quite often, sooner or later, when you have two independent environments.

But now we don't have any CPU image anymore, not even in the new TF Docker refactoring effort.

.devcontainer is not only about Codespaces but also about development in VS Code with the developer's own resources, which is why I've put in some commented lines to enable GPU options. But we don't have a valid upstream CPU image, and the custom-ops TF images are unmaintained (see the mentioned forum thread).

@fsx950223
Member Author

fsx950223 commented Nov 25, 2021

Could we specify a different .devcontainer for different environments? We could separate the latest-cpu and latest-gpu Docker images.
We could run CI/CD without .devcontainer before it was added.

@bhack
Contributor

bhack commented Nov 25, 2021

We could separate the latest-cpu and latest-gpu Docker images. As you can see, the image type is already an arg controlled by .devcontainer:

ARG IMAGE_TYPE=latest-cpu

The problem is that the image on our (Addons) Docker Hub registry is de facto a GPU one after #2598 (comment) was merged.
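
As a sketch of how such an arg can select the base image: the FROM line and the idea of a separate latest-gpu tag are assumptions for illustration, not copied from the repository's .devcontainer/Dockerfile.

```dockerfile
# Default image type; can be overridden at build time.
ARG IMAGE_TYPE=latest-cpu

# Assumed base image reference, using the Docker Hub repository named in this issue.
FROM tfaddons/dev_container:${IMAGE_TYPE}
```

With a layout like this, `docker build --build-arg IMAGE_TYPE=latest-gpu .` would pick the other tag without needing a second Dockerfile, provided a genuinely CPU-only image exists under the default tag.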

I've prepared an upstream PR to start to separate baseline (CPU) and CUDA layers:
tensorflow/build#47

We still need to work with commented lines in a single .devcontainer, since VS Code/Codespaces still really only works with the default .devcontainer.

See more at:
microsoft/vscode-remote-release#1165
microsoft/vscode-remote-release#3279

@bhack
Contributor

bhack commented Nov 29, 2021

TF doesn't want to accept the contribution of an intermediate CPU target based on a small refactoring of their own new recipe: tensorflow/build#47 (comment).

So when we merge @seanpmorgan's (and my) #2515, we will still have all the CUDA layer overhead.

I will accept any suggestion, but on my side I don't want to maintain multiple diverging Dockerfile recipes between the Addons development environment and the Addons CI.

@bhack
Contributor

bhack commented Dec 1, 2021

/cc @yarri-oss

@seanpmorgan
Member

Discussed this in the grooming meeting. It's certainly something we want supported for Addons, but we're not willing to build our own containers given that the custom-op image is no longer supported. Let's bring this up at the next SIG Build meeting to see if we can get any traction.

@seanpmorgan added the "blocked" (Pending something elses completion) label Dec 16, 2021
@bhack
Contributor

bhack commented Jan 11, 2022

We discussed this in today's SIG Build meeting, but it seems that the tensorflow/build#47 (comment) review cannot go ahead.

@seanpmorgan
Member

TensorFlow Addons is transitioning to a minimal maintenance and release mode. New features will not be added to this repository. For more information, please see our public messaging on this decision:
TensorFlow Addons Wind Down

Please consider sending feature requests / contributions to other repositories in the TF community with a similar charter to TFA:
Keras
Keras-CV
Keras-NLP
