Errors when using the tensorflow or tensorflow-gpu packages #216


Closed
marunguy opened this issue Jun 23, 2022 · 13 comments


marunguy commented Jun 23, 2022

What is the recommended development environment?

  • my environment
    • windows 10 64bit
    • python 3.8.10 64bit
    • tensorflow 2.9.1
    • tensorflow-directml-plugin 0.0.1.dev220621
    • Intel Core i5-1135G7, Intel Iris Xe Graphics

An error occurs when running a simple example.

  • test code
#!/usr/bin/env python
# -*- coding: utf-8 -*-
#---------------------------------------------------------------------------------------------------
"""..."""

import tensorflow as tf

tf.debugging.set_log_device_placement(True)
# tf.enable_eager_execution()

from tensorflow import keras
from tensorflow.keras import layers

#---------------------------------------------------------------------------------------------------
def ex() -> None:
    """..."""
    inputs = keras.Input(shape=(3, 32, 32))
    x = layers.Conv2D(32, (3, 3), activation="relu")(inputs)
    x = layers.Flatten()(x)
    x = layers.Dense(256, activation="relu")(x)
    outputs = layers.Dense(10, activation="softmax")(x)

    model = keras.Model(inputs, outputs)
    model.summary()
#---------------------------------------------------------------------------------------------------
def main() -> None:
    """..."""
    ex()
#---------------------------------------------------------------------------------------------------
if __name__ == "__main__":
    main()
  • error message
(tf2dml_38) d:\ai\ai_test>python e.py
2022-06-23 14:46:49.848519: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cudart64_110.dll'; dlerror: cudart64_110.dll not found
2022-06-23 14:46:49.848783: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2022-06-23 14:46:51.668324: I tensorflow/c/logging.cc:34] Successfully opened dynamic library D:\devtool\pyvenv\tf2dml_38\lib\site-packages\tensorflow-plugins/directml/directml.0de2b4431c6572ee74152a7ee0cd3fb1534e4a95.dll
2022-06-23 14:46:51.669120: I tensorflow/c/logging.cc:34] Successfully opened dynamic library dxgi.dll
2022-06-23 14:46:51.672098: I tensorflow/c/logging.cc:34] Successfully opened dynamic library d3d12.dll
2022-06-23 14:46:51.733125: I tensorflow/c/logging.cc:34] DirectML device enumeration: found 1 compatible adapters.
2022-06-23 14:46:52.042786: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-06-23 14:46:52.043678: I tensorflow/c/logging.cc:34] DirectML: creating device on adapter 0 (Intel(R) Iris(R) Xe Graphics)
2022-06-23 14:46:52.070285: I tensorflow/c/logging.cc:34] Successfully opened dynamic library Kernel32.dll
2022-06-23 14:46:52.071152: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:305] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2022-06-23 14:46:52.071254: W tensorflow/core/common_runtime/pluggable_device/pluggable_device_bfc_allocator.cc:28] Overriding allow_growth setting because force_memory_growth was requested by the device.
2022-06-23 14:46:52.071518: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:271] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6951 MB memory) -> physical PluggableDevice (device: 0, name: DML, pci bus id: <undefined>)
2022-06-23 14:46:52.074304: I tensorflow/core/common_runtime/eager/execute.cc:1323] Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-23 14:46:52.076556: I tensorflow/core/common_runtime/eager/execute.cc:1323] Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-23 14:46:52.076744: I tensorflow/core/common_runtime/eager/execute.cc:1323] Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
Traceback (most recent call last):
  File "e.py", line 31, in <module>
    main()
  File "e.py", line 28, in main
    ex()
  File "e.py", line 18, in ex
    x = layers.Conv2D(32, (3, 3), activation="relu")(inputs)
  File "D:\devtool\pyvenv\tf2dml_38\lib\site-packages\keras\utils\traceback_utils.py", line 67, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "D:\devtool\pyvenv\tf2dml_38\lib\site-packages\keras\backend.py", line 1920, in random_uniform
    return tf.random.uniform(
tensorflow.python.framework.errors_impl.InvalidArgumentError: Multiple OpKernel registrations match NodeDef at the same priority '{{node RandomUniform}}': 'op: "RandomUniform" device_type: "GPU" constraint { name: "T" allowed_values { list { type: DT_INT32 } } } constraint { name: "dtype" allowed_values { list { type: DT_FLOAT } } } host_memory_arg: "shape"' and 'op: "RandomUniform" device_type: "GPU" constraint { name: "T" allowed_values { list { type: DT_INT32 } } } constraint { name: "dtype" allowed_values { list { type: DT_FLOAT } } } host_memory_arg: "shape"' [Op:RandomUniform]
  • error message with tensorflow 2.8.0
(tf2dml_38) d:\ai\ai_test>python e.py
2022-06-23 14:50:16.448936: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cudart64_110.dll'; dlerror: cudart64_110.dll not found
2022-06-23 14:50:16.449160: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
Traceback (most recent call last):
  File "e.py", line 6, in <module>
    import tensorflow as tf
  File "D:\devtool\pyvenv\tf2dml_38\lib\site-packages\tensorflow\__init__.py", line 443, in <module>
    _ll.load_library(_plugin_dir)
  File "D:\devtool\pyvenv\tf2dml_38\lib\site-packages\tensorflow\python\framework\load_library.py", line 151, in load_library
    py_tf.TF_LoadLibrary(lib)
tensorflow.python.framework.errors_impl.NotFoundError: D:\devtool\pyvenv\tf2dml_38\lib\site-packages\tensorflow-plugins\tfdml_plugin.dll not found
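For context on the first error: "Multiple OpKernel registrations match NodeDef at the same priority" means two kernel registrations with identical op, device type, constraints, and priority both matched the same node, so dispatch is ambiguous. A toy sketch of why such a lookup fails (hypothetical names, not TensorFlow internals):

```python
# Toy sketch (hypothetical, NOT TensorFlow internals): a kernel registry where
# two entries share the same op/device/constraints and priority, making
# dispatch ambiguous -- the situation the InvalidArgumentError reports when
# both stock TF 2.9 and the DirectML plugin register a GPU kernel.
registry = {}

def register(op, device, constraints, priority, impl):
    registry.setdefault((op, device), []).append((priority, constraints, impl))

def lookup(op, device, attrs):
    matches = [(prio, impl)
               for prio, cons, impl in registry.get((op, device), [])
               if all(attrs.get(name) in allowed for name, allowed in cons.items())]
    if not matches:
        raise LookupError(f"no kernel registered for {op}")
    best = max(prio for prio, _ in matches)
    winners = [impl for prio, impl in matches if prio == best]
    if len(winners) > 1:
        # Mirrors "Multiple OpKernel registrations match NodeDef at the same priority"
        raise ValueError(f"multiple kernels match {op} at priority {best}: {winners}")
    return winners[0]

# Two identical GPU registrations for the same op collide:
register("RandomUniform", "GPU", {"dtype": ["DT_FLOAT"]}, 1, "builtin_gpu_kernel")
register("RandomUniform", "GPU", {"dtype": ["DT_FLOAT"]}, 1, "plugin_gpu_kernel")

try:
    lookup("RandomUniform", "GPU", {"dtype": "DT_FLOAT"})
except ValueError as err:
    print(err)
```

This is only an illustration of the ambiguity; the real conflict is between kernels built into the tensorflow 2.9 wheel and those registered by tensorflow-directml-plugin.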

marunguy commented Jun 23, 2022

When the tensorflow-cpu package is installed instead of tensorflow, my test code works correctly.
I think the environments in which the plugin works should be documented.

python -m pip uninstall -y tensorflow
python -m pip install -U tensorflow-cpu
(tf2dml_38) d:\ai\ai_test>python e.py
2022-06-23 15:45:04.414644: I tensorflow/c/logging.cc:34] Successfully opened dynamic library D:\devtool\pyvenv\tf2dml_38\lib\site-packages\tensorflow-plugins/directml/directml.0de2b4431c6572ee74152a7ee0cd3fb1534e4a95.dll
2022-06-23 15:45:04.415756: I tensorflow/c/logging.cc:34] Successfully opened dynamic library dxgi.dll
2022-06-23 15:45:04.418218: I tensorflow/c/logging.cc:34] Successfully opened dynamic library d3d12.dll
2022-06-23 15:45:04.478579: I tensorflow/c/logging.cc:34] DirectML device enumeration: found 1 compatible adapters.
2022-06-23 15:45:04.793222: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-06-23 15:45:04.794119: I tensorflow/c/logging.cc:34] DirectML: creating device on adapter 0 (Intel(R) Iris(R) Xe Graphics)
2022-06-23 15:45:04.820391: I tensorflow/c/logging.cc:34] Successfully opened dynamic library Kernel32.dll
2022-06-23 15:45:04.821300: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:305] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2022-06-23 15:45:04.821385: W tensorflow/core/common_runtime/pluggable_device/pluggable_device_bfc_allocator.cc:28] Overriding allow_growth setting because force_memory_growth was requested by the device.
2022-06-23 15:45:04.823647: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:271] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6951 MB memory) -> physical PluggableDevice (device: 0, name: DML, pci bus id: <undefined>)
2022-06-23 15:45:04.826201: I tensorflow/core/common_runtime/eager/execute.cc:1323] Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-23 15:45:04.828504: I tensorflow/core/common_runtime/eager/execute.cc:1323] Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-23 15:45:04.828694: I tensorflow/core/common_runtime/eager/execute.cc:1323] Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-23 15:45:04.829140: I tensorflow/core/common_runtime/eager/execute.cc:1323] Executing op RandomUniform in device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-23 15:45:04.832519: I tensorflow/core/common_runtime/eager/execute.cc:1323] Executing op Sub in device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-23 15:45:04.833434: I tensorflow/core/common_runtime/eager/execute.cc:1323] Executing op Mul in device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-23 15:45:04.835200: I tensorflow/core/common_runtime/eager/execute.cc:1323] Executing op AddV2 in device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-23 15:45:04.835964: I tensorflow/core/common_runtime/eager/execute.cc:1323] Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-23 15:45:04.837891: I tensorflow/core/common_runtime/eager/execute.cc:1323] Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-23 15:45:04.838815: I tensorflow/core/common_runtime/eager/execute.cc:1323] Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-23 15:45:04.838980: I tensorflow/core/common_runtime/eager/execute.cc:1323] Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-23 15:45:04.839236: I tensorflow/core/common_runtime/eager/execute.cc:1323] Executing op Fill in device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-23 15:45:04.840324: I tensorflow/core/common_runtime/eager/execute.cc:1323] Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-23 15:45:04.840505: I tensorflow/core/common_runtime/eager/execute.cc:1323] Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-23 15:45:04.847835: I tensorflow/core/common_runtime/eager/execute.cc:1323] Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-23 15:45:04.848112: I tensorflow/core/common_runtime/eager/execute.cc:1323] Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-23 15:45:04.848338: I tensorflow/core/common_runtime/eager/execute.cc:1323] Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-23 15:45:04.848542: I tensorflow/core/common_runtime/eager/execute.cc:1323] Executing op RandomUniform in device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-23 15:45:04.849310: I tensorflow/core/common_runtime/eager/execute.cc:1323] Executing op Sub in device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-23 15:45:04.850405: I tensorflow/core/common_runtime/eager/execute.cc:1323] Executing op Mul in device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-23 15:45:04.853836: I tensorflow/core/common_runtime/eager/execute.cc:1323] Executing op AddV2 in device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-23 15:45:04.855384: I tensorflow/core/common_runtime/eager/execute.cc:1323] Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-23 15:45:04.855644: I tensorflow/core/common_runtime/eager/execute.cc:1323] Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-23 15:45:04.856197: I tensorflow/core/common_runtime/eager/execute.cc:1323] Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-23 15:45:04.856320: I tensorflow/core/common_runtime/eager/execute.cc:1323] Executing op Fill in device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-23 15:45:04.856850: I tensorflow/core/common_runtime/eager/execute.cc:1323] Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-23 15:45:04.857350: I tensorflow/core/common_runtime/eager/execute.cc:1323] Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-23 15:45:04.861528: I tensorflow/core/common_runtime/eager/execute.cc:1323] Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-23 15:45:04.861711: I tensorflow/core/common_runtime/eager/execute.cc:1323] Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-23 15:45:04.862009: I tensorflow/core/common_runtime/eager/execute.cc:1323] Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-23 15:45:04.862336: I tensorflow/core/common_runtime/eager/execute.cc:1323] Executing op RandomUniform in device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-23 15:45:04.863311: I tensorflow/core/common_runtime/eager/execute.cc:1323] Executing op Sub in device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-23 15:45:04.863599: I tensorflow/core/common_runtime/eager/execute.cc:1323] Executing op Mul in device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-23 15:45:04.864040: I tensorflow/core/common_runtime/eager/execute.cc:1323] Executing op AddV2 in device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-23 15:45:04.864486: I tensorflow/core/common_runtime/eager/execute.cc:1323] Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-23 15:45:04.864642: I tensorflow/core/common_runtime/eager/execute.cc:1323] Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-23 15:45:04.865170: I tensorflow/core/common_runtime/eager/execute.cc:1323] Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-23 15:45:04.865273: I tensorflow/core/common_runtime/eager/execute.cc:1323] Executing op Fill in device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-23 15:45:04.865728: I tensorflow/core/common_runtime/eager/execute.cc:1323] Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-23 15:45:04.865837: I tensorflow/core/common_runtime/eager/execute.cc:1323] Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-23 15:45:04.869567: I tensorflow/core/common_runtime/eager/execute.cc:1323] Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-23 15:45:04.869814: I tensorflow/core/common_runtime/eager/execute.cc:1323] Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-23 15:45:04.870028: I tensorflow/core/common_runtime/eager/execute.cc:1323] Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-23 15:45:04.870323: I tensorflow/core/common_runtime/eager/execute.cc:1323] Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-23 15:45:04.870434: I tensorflow/core/common_runtime/eager/execute.cc:1323] Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-23 15:45:04.870942: I tensorflow/core/common_runtime/eager/execute.cc:1323] Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-23 15:45:04.873005: I tensorflow/core/common_runtime/eager/execute.cc:1323] Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Model: "model"
_________________________________________________________________
 Layer (type)                Output Shape              Param #
=================================================================
 input_1 (InputLayer)        [(None, 3, 32, 32)]       0

 conv2d (Conv2D)             (None, 1, 30, 32)         9248

 flatten (Flatten)           (None, 960)               0

 dense (Dense)               (None, 256)               246016

 dense_1 (Dense)             (None, 10)                2570

=================================================================
Total params: 257,834
Trainable params: 257,834
Non-trainable params: 0
_________________________________________________________________
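The Param # column in the summary above can be reproduced by hand. A plain-Python check of the same arithmetic (note that the Input shape (3, 32, 32) is treated as channels_last, so the spatial size is 3x32 with 32 channels, and the 3x3 "valid" convolution leaves only one output row):

```python
# Reproduce the Param # column of the model summary above.
conv = 3 * 3 * 32 * 32 + 32         # kernel h*w * in_channels * filters + bias
flat = 1 * 30 * 32                  # Conv2D "valid" output (1, 30, 32), flattened
dense1 = flat * 256 + 256           # Dense(256) weights + bias
dense2 = 256 * 10 + 10              # Dense(10) weights + bias
total = conv + dense1 + dense2
print(conv, dense1, dense2, total)  # 9248 246016 2570 257834
```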

maggie1059 (Collaborator) commented

Hi @marunguy, thanks for bringing this to our attention! We're looking into it and will get back to you with an update as soon as possible. We do intend to support the tensorflow package as well, but please continue to use the tensorflow-cpu package to avoid this issue for the time being.

@PatriceVignola PatriceVignola changed the title Recommended development environment? Errors when using the tensorflow or tensorflow-gpu packages Jul 11, 2022

anammari commented Jul 13, 2022

I'm getting a similar error to the above.

My environment:

  • Windows 10 64 bit ver 21H2
  • WSL 2 (Ubuntu 20.04.4 LTS)
  • Python 3.10.5 (in WSL 2).
  • tensorflow 2.9.0
  • tensorflow-directml-plugin 0.0.1.dev220621
  • Intel Core i7-7700HQ CPU
  • NVIDIA GeForce GTX 1060

My testing command:

python squeezenet.py --mode train --tb_profile --cifar10

My output:

2022-07-13 14:14:38.916979: I tensorflow/c/logging.cc:34] Successfully opened dynamic library libdirectml.0de2b4431c6572ee74152a7ee0cd3fb1534e4a95.so
2022-07-13 14:14:38.917148: I tensorflow/c/logging.cc:34] Successfully opened dynamic library libdxcore.so
2022-07-13 14:14:38.932751: I tensorflow/c/logging.cc:34] Successfully opened dynamic library libd3d12.so
2022-07-13 14:14:45.256169: I tensorflow/c/logging.cc:34] DirectML device enumeration: found 2 compatible adapters.
2022-07-13 14:14:45.256227: W tensorflow/c/logging.cc:37] More than one physical devices were found, but tensorflow-directml-plugin doesn't support multiple devices yet. The first available device in order of performance was selected by default. To select a different device, set the DML_VISIBLE_DEVICES environment variable to its index (e.g. DML_VISIBLE_DEVICES=1).
Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz
170498071/170498071 [==============================] - 421s 2us/step
2022-07-13 14:22:02.959902: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-07-13 14:22:02.973040: I tensorflow/c/logging.cc:34] DirectML: creating device on adapter 0 (NVIDIA GeForce GTX 1060)
2022-07-13 14:22:04.767394: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:305] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2022-07-13 14:22:04.767463: W tensorflow/core/common_runtime/pluggable_device/pluggable_device_bfc_allocator.cc:28] Overriding allow_growth setting because force_memory_growth was requested by the device.
2022-07-13 14:22:04.767512: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:271] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4966 MB memory) -> physical PluggableDevice (device: 0, name: DML, pci bus id: )
Traceback (most recent call last):
  File "/mnt/c/Users/Ahmad/github/directml/DirectML/TensorFlow/TF2/squeezenet/squeezenet.py", line 161, in <module>
    main()
  File "/mnt/c/Users/Ahmad/github/directml/DirectML/TensorFlow/TF2/squeezenet/squeezenet.py", line 125, in main
    model = SqueezeNet_CIFAR()
  File "/mnt/c/Users/Ahmad/github/directml/DirectML/TensorFlow/TF2/squeezenet/squeezenet.py", line 78, in __init__
    super(SqueezeNet_CIFAR, self).__init__()
  File "/mnt/c/Users/Ahmad/tfdml_plugin/lib/python3.10/site-packages/tensorflow/python/training/tracking/base.py", line 587, in _method_wrapper
    result = method(self, *args, **kwargs)
  File "/mnt/c/Users/Ahmad/tfdml_plugin/lib/python3.10/site-packages/keras/utils/traceback_utils.py", line 67, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/mnt/c/Users/Ahmad/tfdml_plugin/lib/python3.10/site-packages/tensorflow/python/framework/ops.py", line 7164, in raise_from_not_ok_status
    raise core._status_to_exception(e) from None  # pylint: disable=protected-access
tensorflow.python.framework.errors_impl.InvalidArgumentError: Multiple OpKernel registrations match NodeDef at the same priority '{{node AssignVariableOp}}': 'op: "AssignVariableOp" device_type: "GPU" constraint { name: "dtype" allowed_values { list { type: DT_INT64 } } } host_memory_arg: "resource"' and 'op: "AssignVariableOp" device_type: "GPU" constraint { name: "dtype" allowed_values { list { type: DT_INT64 } } } host_memory_arg: "resource"'
when instantiating AssignVariableOp [Op:AssignVariableOp]

PatriceVignola (Contributor) commented

@anammari Please use tensorflow-cpu instead of tensorflow for the time being. They offer the same functionality when used with tensorflow-directml-plugin, but tensorflow 2.9 contains a bug that we're working on fixing for TF 2.10: tensorflow/tensorflow#56707


mehfuzh commented Jul 22, 2022

I am getting a similar error with the DirectML plugin for tensorflow 2.9.1. I want to use the GPU, not the CPU, and I want to run it in WSL so that I don't have to dual boot.

I am using the latest 2022 Dell XPS 15 with an i9 and an NVIDIA RTX 3050 Ti.

  File "/home/smartlp/miniconda3/envs/tf2.8/lib/python3.7/site-packages/tensorflow/python/framework/ops.py", line 7164, in raise_from_not_ok_status
    raise core._status_to_exception(e) from None  # pylint: disable=protected-access
tensorflow.python.framework.errors_impl.InvalidArgumentError: Multiple OpKernel registrations match NodeDef at the same priority '{{node AssignVariableOp}}': 'op: "AssignVariableOp" device_type: "GPU" constraint { name: "dtype" allowed_values { list { type: DT_INT64 } } } host_memory_arg: "resource"' and 'op: "AssignVariableOp" device_type: "GPU" constraint { name: "dtype" allowed_values { list { type: DT_INT64 } } } host_memory_arg: "resource"'
         when instantiating AssignVariableOp [Op:AssignVariableOp]

PatriceVignola (Contributor) commented

@mehfuzh You can still use the GPU if you install tensorflow-directml-plugin along with the tensorflow-cpu package instead of tensorflow as a workaround until we get the fix merged in (tensorflow/tensorflow#56707). The -cpu suffix in the tensorflow package name doesn't mean that tensorflow-directml-plugin won't use your GPU; it just means that CUDA/ROCm GPUs won't be used. But as long as you have tensorflow-directml-plugin installed on top of it, you should be good to go! Let us know how it works for you.

pip uninstall tensorflow
pip install tensorflow-cpu
pip install tensorflow-directml-plugin


mehfuzh commented Jul 22, 2022

@PatriceVignola I'll test it. But I don't want to use the built-in Intel Iris Xe GPU; I want to take advantage of the RTX GPU, which (AFAIK) mostly relies on the CUDA/cuDNN libraries.

PatriceVignola (Contributor) commented

@mehfuzh tensorflow-directml-plugin will let you take advantage of your NVIDIA GPU (including tensor cores in certain scenarios), even if you're using it with tensorflow-cpu. The version of the underlying tensorflow package is irrelevant, since tensorflow-directml-plugin overrides the GPU device with its own DirectML-based devices.


mehfuzh commented Jul 23, 2022

@PatriceVignola This makes sense. However, after installing tensorflow-cpu, it is giving me the following error (it works fine with PowerShell + conda, or even an M1 Ultra + pluggable device):

tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot assign a device for operation sequential/embedding/embedding_lookup: Could not satisfy explicit device specification '' because the node {{colocation_node sequential/embedding/embedding_lookup}} was colocated with a group of nodes that required incompatible device '/job:localhost/replica:0/task:0/device:GPU:0'. All available devices [/job:localhost/replica:0/task:0/device:CPU:0, /job:localhost/replica:0/task:0/device:GPU:0].
Colocation Debug Info:
Colocation group had the following types and supported devices:
Root Member(assigned_device_name_index_=2 requested_device_name_='/job:localhost/replica:0/task:0/device:GPU:0' assigned_device_name_='/job:localhost/replica:0/task:0/device:GPU:0' resource_device_name_='/job:localhost/replica:0/task:0/device:GPU:0' supported_device_types_=[CPU] possible_devices_=[]

PatriceVignola (Contributor) commented

@mehfuzh Could you open another issue for this error and include the complete output? It would also help if you could provide a simple script that reproduces it.


TomieAi commented Sep 7, 2022

pip install tensorflow-cpu==2.9
works fine.

(tfdml_plugin) tomie@TomieNW:~/ai$ python test.py
2022-09-08 00:23:04.444358: I tensorflow/c/logging.cc:34] Successfully opened dynamic library libdirectml.0de2b4431c6572ee74152a7ee0cd3fb1534e4a95.so
2022-09-08 00:23:04.444429: I tensorflow/c/logging.cc:34] Successfully opened dynamic library libdxcore.so
2022-09-08 00:23:04.445555: I tensorflow/c/logging.cc:34] Successfully opened dynamic library libd3d12.so
2022-09-08 00:23:06.878862: I tensorflow/c/logging.cc:34] DirectML device enumeration: found 1 compatible adapters.
2022-09-08 00:23:07.208456: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-09-08 00:23:07.209431: I tensorflow/c/logging.cc:34] DirectML: creating device on adapter 0 (AMD Radeon RX 5700)

I have an RX 5700. It says it's using that, but why is my CPU temperature ramping up instead of my GPU's?
Then again, maybe the data I have isn't complicated enough for the GPU, so the CPU ends up doing most of the work; I don't know.

I tried this, though:

tf.debugging.set_log_device_placement(True)

# Place tensors on the GPU
with tf.device('/GPU:0'):
  a = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
  b = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])

# Run on the GPU
c = tf.matmul(a, b)
print(c)

The result is:

2022-09-08 00:39:04.278269: I tensorflow/c/logging.cc:34] Successfully opened dynamic library libdirectml.0de2b4431c6572ee74152a7ee0cd3fb1534e4a95.so
2022-09-08 00:39:04.278357: I tensorflow/c/logging.cc:34] Successfully opened dynamic library libdxcore.so
2022-09-08 00:39:04.279080: I tensorflow/c/logging.cc:34] Successfully opened dynamic library libd3d12.so
2022-09-08 00:39:06.682054: I tensorflow/c/logging.cc:34] DirectML device enumeration: found 1 compatible adapters.
2022-09-08 00:39:06.930454: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-09-08 00:39:06.931500: I tensorflow/c/logging.cc:34] DirectML: creating device on adapter 0 (AMD Radeon RX 5700)
2022-09-08 00:39:08.256615: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:305] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2022-09-08 00:39:08.256685: W tensorflow/core/common_runtime/pluggable_device/pluggable_device_bfc_allocator.cc:28] Overriding allow_growth setting because force_memory_growth was requested by the device.
2022-09-08 00:39:08.256737: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:271] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6939 MB memory) -> physical PluggableDevice (device: 0, name: DML, pci bus id: <undefined>)
2022-09-08 00:39:08.268485: I tensorflow/core/common_runtime/eager/execute.cc:1323] Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
2022-09-08 00:39:08.268829: I tensorflow/core/common_runtime/eager/execute.cc:1323] Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
2022-09-08 00:39:08.269389: I tensorflow/core/common_runtime/eager/execute.cc:1323] Executing op MatMul in device /job:localhost/replica:0/task:0/device:GPU:0
tf.Tensor(
[[22. 28.]
 [49. 64.]], shape=(2, 2), dtype=float32)

So it is indeed working, it is what it is xD. I don't know what I'm doing; I'm just playing around, following this series for fun and educational purposes: https://www.youtube.com/watch?v=z1PGJ9quPV8
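The MatMul output in the log above is easy to sanity-check by hand; the same 2x3 by 3x2 product in plain Python:

```python
# Recompute the 2x3 @ 3x2 matmul from the log above in plain Python.
a = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
b = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
c = [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(2)]
     for i in range(2)]
print(c)  # [[22.0, 28.0], [49.0, 64.0]]
```

This matches the tf.Tensor values printed by the GPU run, confirming the placement log corresponds to a correct result.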


PatriceVignola commented Sep 7, 2022

Like you said, this is a very simple model, so the overhead of the CPU initializing TensorFlow and copying tensors to/from the GPU outweighs the benefits of having a GPU. You should see GPU activity ramp up once you start running bigger models ^^

PatriceVignola (Contributor) commented

The TensorFlow team has no plan at this moment to allow plugins that define devices with the "GPU" string to be used together with tensorflow or tensorflow-gpu. So for the foreseeable future, only tensorflow-cpu should be used with tensorflow-directml-plugin.

We just released version 0.1.0.dev220928 which adds tensorflow-cpu to the required dependencies, so installing tensorflow-directml-plugin should now install the correct package.
