TensorFlow eager-mode VS PyTorch eager-mode #49229


Closed
innat opened this issue May 17, 2021 · 9 comments
Assignees
Labels
comp:eager Eager related issues TF 2.4 for issues related to TF 2.4 type:performance Performance Issue

Comments

@innat

innat commented May 17, 2021

Config:

OS: Windows 10
TensorFlow 2.4.1
Torch 1.7.1

Query

We know that eager mode is slower than graph mode in TF 2.x. But how much slower can it be compared to PyTorch's eager mode?

A question was asked on SO about this, where the OP used a deep reinforcement learning code example with a custom training loop to compare. In that example, the PyTorch code takes approximately ~3 minutes to complete, while the TF code with the same training pipeline takes approximately ~2 hours to complete, and even ends up with comparatively lower accuracy.

It probably also brings in other issues, like memory leaks during custom training loops. When I run the PyTorch code, CPU usage is 100% and the GPU's 3D queue (RTX 2070) is at approximately 20%. But when I run the TF code, CPU usage is ~50%, physical RAM grows over time (a possible memory leak), VRAM usage gets very high, and the GPU's 3D queue is not used at all. I'm not sure what the root cause is.
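As a side note, one framework-agnostic way to check whether a custom training loop is really leaking host memory over time is to snapshot allocations with the standard-library `tracemalloc` module. This is only a minimal sketch with a hypothetical `toy_step` standing in for one training iteration, not the actual RL code from the SO post:

```python
# Check whether a custom loop accumulates host memory over iterations.
# `toy_step` is a hypothetical stand-in for one training step.
import gc
import tracemalloc

def toy_step(state):
    # Stand-in for one training iteration mutating some loop state.
    state["acc"] = state.get("acc", 0.0) + 1.0
    return state

tracemalloc.start()
state = {}
snapshots = []
for i in range(3000):
    state = toy_step(state)
    if i % 1000 == 0:
        gc.collect()  # rule out garbage that merely hasn't been collected yet
        current, _peak = tracemalloc.get_traced_memory()
        snapshots.append(current)
tracemalloc.stop()

# If `snapshots` keeps climbing run after run, something in the loop
# is accumulating Python-side references.
print(snapshots)
```

Note this only measures Python-heap allocations; GPU memory growth would need the framework's own tooling.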

The only significant difference appears after optimizing the TF code and compiling it for graph execution, as demonstrated in the accepted answer. The answer is fine, but it reads more like a way to optimize TF code.
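For reference, the kind of optimization the accepted answer demonstrates is wrapping the custom training step in `tf.function` so the identical step runs as a compiled graph instead of op-by-op. A minimal sketch with a hypothetical toy model (not the RL code from the SO post):

```python
# Same custom training step, run eagerly and as a compiled graph.
# The model, optimizer, and data here are hypothetical placeholders.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1),
])
optimizer = tf.keras.optimizers.Adam()
loss_fn = tf.keras.losses.MeanSquaredError()

def train_step(x, y):
    with tf.GradientTape() as tape:
        loss = loss_fn(y, model(x, training=True))
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss

# Graph-compiled variant of the identical step:
graph_step = tf.function(train_step)

x = tf.random.normal((256, 32))
y = tf.random.normal((256, 1))
eager_loss = train_step(x, y)  # executes op-by-op (eager)
graph_loss = graph_step(x, y)  # traced once, then replayed as a graph
```

The point of the question stands, though: this changes the execution model rather than explaining why eager mode itself is so much slower.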

I'm wondering: say we need to run TF code in eager mode; in that case, what is the root cause of this performance and execution gap between TF and PyTorch? Is it expected behavior? The OP shared a plug-and-play code example; please find it here.
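To put a rough number on the eager-vs-graph gap on any given machine, one can time the same op both ways with the standard-library `timeit`. This is a toy workload (a single matmul), so absolute numbers will vary with hardware and op size, and for such small ops graph mode may not even win:

```python
# Rough per-call timing of the same op in eager mode vs. via tf.function.
import timeit
import tensorflow as tf

@tf.function
def graph_matmul(a, b):
    return tf.matmul(a, b)

a = tf.random.normal((256, 256))
b = tf.random.normal((256, 256))

# Warm up both paths so tracing and device init stay out of the timed region.
tf.matmul(a, b)
graph_matmul(a, b)

eager_t = timeit.timeit(lambda: tf.matmul(a, b), number=100)
graph_t = timeit.timeit(lambda: graph_matmul(a, b), number=100)
print(f"eager: {eager_t:.4f}s  graph: {graph_t:.4f}s")
```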

@innat innat added the type:performance Performance Issue label May 17, 2021
@saikumarchalla saikumarchalla added TF 2.4 for issues related to TF 2.4 comp:eager Eager related issues labels May 18, 2021
@rmothukuru rmothukuru added the stat:awaiting tensorflower Status - Awaiting response from tensorflower label May 18, 2021
@innat
Author

innat commented Jun 2, 2021

@rmothukuru please let me know if I need to add any more information. The code (mentioned above) is plug-and-play; you don't need to bother with extra libraries or anything complicated.

@innat
Author

innat commented Nov 25, 2021

@rmothukuru
any update?

@innat
Author

innat commented Mar 7, 2022

@rmothukuru
Could you please give some feedback?

@innat
Author

innat commented Jun 12, 2022

It's been a year with still no response. To speed things up, I've prepared a gist to quickly execute the program. Please find the gist here. Note that when I reported the anomaly, it was TF 2.4; by now it has moved on to TF 2.8 or TF 2.9, but the behavior is still the same. Thanks.

TL;DR: with the same code, PyTorch takes a minute to finish whereas TF takes hours; executing the TensorFlow code in graph mode improves its execution time (details). Hence the title: TF eager mode vs PyTorch eager mode.

@mohantym
Contributor

Hi @innat!

I see a huge difference between the eager-mode timings of TensorFlow and PyTorch now, in 2.10 and 2.11.
Could you give us an update from your side?

Thank you!

@mohantym mohantym added stat:awaiting response Status - Awaiting response from author and removed stat:awaiting tensorflower Status - Awaiting response from tensorflower labels Jan 26, 2023
@innat
Author

innat commented Jan 28, 2023

@mohantym Could you please provide quantitative results for the execution times you found? I still observe the issue.

@google-ml-butler google-ml-butler bot removed the stat:awaiting response Status - Awaiting response from author label Jan 28, 2023
@innat
Author

innat commented Jan 29, 2023

Also, if you think TF 2.10 or 2.11 fixes the issue that was reported in TF 2.4 (and persisted through 2.9), please point me to the relevant PR that fixes the unknown root cause of such a dramatic performance drop.

@innat
Author

innat commented Aug 4, 2023

PyTorch is not only faster but also more efficient than TensorFlow in an eager mode setup. Cool!

@innat innat closed this as completed Aug 4, 2023
@google-ml-butler

Are you satisfied with the resolution of your issue?

4 participants