enable all inference and train on Gaudi/Gaudi2 with optimized perf with latest base #139

Merged: 4 commits merged into hpcaitech:habana on Jan 16, 2023

LeoZhao-Intel
Contributor

No description provided.

@Gy-Lu Gy-Lu requested a review from Shenggan January 16, 2023 03:17
@Shenggan Shenggan added the Run Build and Test Run test for pull request label Jan 16, 2023
@Shenggan Shenggan merged commit 76499df into hpcaitech:habana Jan 16, 2023
Shenggan added a commit that referenced this pull request Jan 16, 2023
* add habana

* add mask

* fix mask in outer_product_mean

* add dap

* add hmp

* merge training code

* add chunk for inference

* fix extra-msa stack for training

* support ddp in training

* fix inference bugs

* code refactoring for habana

* support hmp training

* enable all inference and train on Gaudi/Gaudi2 with optimized perf with latest base (#139)

* enable all inference and train on Gaudi/Gaudi2 with optimized perf

* refine code to adapt new base

* refine code to fix issues in code review

Co-authored-by: habanachina <[email protected]>

Co-authored-by: Leo Zhao <[email protected]>
Co-authored-by: habanachina <[email protected]>
@KatarinaYuan

Hi,
Thanks for this amazing work! If possible, could you please share some benchmarks comparing training and inference time with and without the Intel Habana cluster?

@LeoZhao-Intel
Contributor Author

This blog post has some details: https://www.hpc-ai.tech/blog/intel-habana. The code runs on both Gaudi and Gaudi2 for training and inference. We will publish more details and benchmarks in the future.
