Dev -> main #843
Merged on Feb 11, 2025 (114 commits)
ce99901
feat: package updates with python311
init-22 Nov 14, 2024
21fb3f9
fix: absl package version change
init-22 Nov 14, 2024
67b9f15
fix: pytorch version change
init-22 Nov 14, 2024
78df36f
fix: tf version to use numpy < 2
init-22 Nov 14, 2024
2584416
fix: librispeech requirement of tf-text rolled back to v2.17
init-22 Nov 15, 2024
d603ce9
fix: using the main repo and branch for testing
init-22 Nov 15, 2024
be68f8c
fix: overflow error resolved and PRNGKey to key
init-22 Nov 16, 2024
e890c89
fix: minor changes in docs
init-22 Nov 20, 2024
1bc2a7b
fix: changing the python versions in workflow to pass the tests
init-22 Nov 30, 2024
7a0fee3
fix: changing numpy compatible version
init-22 Nov 30, 2024
7cdea16
adding key_data to check the CI tests
init-22 Nov 30, 2024
7264c3f
fix: updated packge of sacrebleu changed the way it used to work, hen…
init-22 Dec 1, 2024
abbdc82
fix: temporarily commenting tfa
init-22 Dec 1, 2024
86029a7
fix: explicitly using mask kwarg to use MultiHeadDotProductAttention …
init-22 Dec 2, 2024
aca45a2
fix: using flax.core.pop instead of variables.pop, better way to upda…
init-22 Dec 2, 2024
2618c5e
fix: changing the traindiffs_tests branch to main again
init-22 Dec 2, 2024
8c90625
fix: unfreeze() in test_param_shapes expect FrozenDict also added fla…
init-22 Dec 2, 2024
1b587b7
fix: formatting changes with yapf
init-22 Dec 3, 2024
c65d93e
fix: running yapf again with 0.32, earlier using 0.43
init-22 Dec 3, 2024
3afd1df
fix: running yapf again with 0.32, earlier using 0.43
init-22 Dec 3, 2024
6ff2010
fix: latest versions of typing dont support Text instead str is recom…
init-22 Dec 3, 2024
55bacbd
fix: minor yapf
init-22 Dec 3, 2024
cfd5a00
Merge branch 'python_upgrades' into python311
priyakasimbeg Dec 6, 2024
5eac985
fix: going back to sacrebleu v1.3.1
init-22 Dec 7, 2024
7867711
feat: custom tf_addons support in TF2.18
init-22 Dec 17, 2024
d6dd2e8
fix: resolving pylint issues in custom_tf_addons
init-22 Dec 17, 2024
a0b587a
resolved pyline and changed the pylint version to current version of …
init-22 Dec 17, 2024
9393145
fix: removing tensorflow addons from setup cfg
init-22 Dec 18, 2024
53eff1d
fix: adding absolute paths for custom_tf_addons in randaugment
init-22 Dec 19, 2024
9d1c957
Merge pull request #811 from init-22/python311
priyakasimbeg Dec 20, 2024
b29397a
update_docker
priyakasimbeg Dec 21, 2024
3d58d5b
add regression tests for target branch python_test_env_upgrade
priyakasimbeg Dec 21, 2024
8d966fe
add regression tests for target branch python_test_env_upgrade
priyakasimbeg Dec 21, 2024
d21d820
fix: changes jax.tree_map to jax.tree.map
init-22 Dec 22, 2024
785d82b
fix: MultiHeadDotProductAttention and optax ctc_loss changes
init-22 Dec 22, 2024
d4aa90a
fix: removed the sacrebleu dependency
init-22 Dec 22, 2024
5e348e4
fix: resolving pylint errors
init-22 Dec 22, 2024
21a3d03
Merge pull request #828 from init-22/custom_bleu
priyakasimbeg Jan 9, 2025
1b88a2e
Merge pull request #827 from init-22/resolve_deprecations
priyakasimbeg Jan 9, 2025
426e6ee
Merge branch 'python_test_env_upgrade' into python_upgrades
priyakasimbeg Jan 9, 2025
b769e6c
fix startup script for python version upgrade
priyakasimbeg Jan 9, 2025
b65157e
fix: getargspec is not supported in python311, using getfullargspec i…
init-22 Jan 14, 2025
8327283
Create equivalent pyproject toml
fsschneider Jan 15, 2025
c8dc704
yapf requires toml
fsschneider Jan 15, 2025
50658bc
Revert to auto-finding packages (includes `tests/`)
fsschneider Jan 15, 2025
616a0f4
Match version to GH
fsschneider Jan 15, 2025
bad76f5
Let `setuptools_scm` handle versioning.
fsschneider Jan 15, 2025
ff4a457
Fix version test
fsschneider Jan 15, 2025
f97c880
Match file name of version test to the other tests
fsschneider Jan 15, 2025
f98b554
Fix linting
fsschneider Jan 15, 2025
8171a32
Update version test to only check major and minor elements, excluding…
fsschneider Jan 15, 2025
96cc471
Rename job in regression tests workflow from `criteo_resnet_pytorch` …
fsschneider Jan 15, 2025
230bf84
Fix some markdown linting issues.
fsschneider Jan 15, 2025
ce44582
Add trailing new line
fsschneider Jan 15, 2025
37f556d
Rename package from `algorithmic-efficiency` to `algoperf`.
fsschneider Jan 15, 2025
bc666a7
Fix linting (due to shorter package name in imports)
fsschneider Jan 15, 2025
3d8e606
Merge branch 'dev' into python_upgrades
priyakasimbeg Jan 16, 2025
d9f13ab
upgrade_jax
priyakasimbeg Jan 16, 2025
1e62f15
Merge remote-tracking branch 'refs/remotes/origin/python_upgrades' in…
priyakasimbeg Jan 16, 2025
01eb881
change jax version
priyakasimbeg Jan 16, 2025
c9b6411
change jax python version
priyakasimbeg Jan 16, 2025
e713053
Merge branch 'dev' into python_upgrades
priyakasimbeg Jan 18, 2025
7d580f1
fix: using jax.random.key_data only when the workload is jax
init-22 Jan 18, 2025
5715618
revert to use PRNGKey
priyakasimbeg Jan 22, 2025
d57dec3
Merge remote-tracking branch 'refs/remotes/origin/python_upgrades' in…
priyakasimbeg Jan 22, 2025
3fb722d
revert changes to submission runner for prng key
priyakasimbeg Jan 22, 2025
9b7cee4
remove extracting key_data
priyakasimbeg Jan 22, 2025
5775ed1
cast np.int32 as int for random.Random arg
priyakasimbeg Jan 23, 2025
1352e70
fix: vim installation
init-22 Jan 26, 2025
d7eebf8
use inductor backend to compile deepspeech instead of eager
priyakasimbeg Jan 30, 2025
58159c5
adding mem_fraction 0.80 for jax workfloads to resolve OOM of certain…
init-22 Feb 2, 2025
81bc93d
mem fraction typo
init-22 Feb 3, 2025
f6ca2bc
env variable for conformer set at the top
init-22 Feb 3, 2025
59126ae
Update documentation with new targets
fsschneider Feb 3, 2025
ff0086c
Use 1.5 instead of 3x the budget for self-tuning
fsschneider Feb 3, 2025
03bc79e
Update max allowed runtimes for each workload
fsschneider Feb 3, 2025
16eb8d6
Clarify in comment that its the old budgets
fsschneider Feb 3, 2025
d3f788d
Adapt step hint as well
fsschneider Feb 3, 2025
b3dec67
Clarify `step_hint`
fsschneider Feb 3, 2025
b4ed6cc
set env variables for pytorch before initializing w ddp.
priyakasimbeg Feb 4, 2025
c1f182e
Merge pull request #830 from fsschneider/pyproject.toml
priyakasimbeg Feb 5, 2025
def1a43
Merge pull request #831 from fsschneider/version_bump
priyakasimbeg Feb 5, 2025
ebf0341
set jax to 0.4.26
priyakasimbeg Feb 5, 2025
1ce6dea
set jax versions
priyakasimbeg Feb 7, 2025
2a0f1c9
switch to pyproject toml
priyakasimbeg Feb 8, 2025
082be03
fix pytorch version
priyakasimbeg Feb 8, 2025
39bb876
fix jax versions
priyakasimbeg Feb 8, 2025
b45a69b
fix: adding wandb under 'full' section
init-22 Feb 8, 2025
b719a6e
fix: wandb version upgrade
init-22 Feb 10, 2025
1ce3e62
remove wandb from full
priyakasimbeg Feb 10, 2025
7261e49
Merge branch 'dev' into minor_nits
fsschneider Feb 10, 2025
b356844
Merge pull request #832 from fsschneider/minor_nits
priyakasimbeg Feb 10, 2025
9ee2d80
Merge branch 'dev' into rename_algoperf
fsschneider Feb 10, 2025
5969f1d
Merge pull request #833 from fsschneider/rename_algoperf
priyakasimbeg Feb 10, 2025
b4f6cb1
Merge branch 'dev' into update_budgets
fsschneider Feb 10, 2025
738658c
Merge pull request #838 from fsschneider/update_budgets
priyakasimbeg Feb 10, 2025
5be969b
fix isort version in test
priyakasimbeg Feb 10, 2025
cabcc59
Merge pull request #839 from mlcommons/lint_fix
priyakasimbeg Feb 10, 2025
c8e2546
Merge branch 'dev' into python_upgrades
priyakasimbeg Feb 11, 2025
a12733a
revert isort version
priyakasimbeg Feb 11, 2025
bee0e3f
revert import order changes
priyakasimbeg Feb 11, 2025
1f72cb3
remove temporary testing for upgrades
priyakasimbeg Feb 11, 2025
a4b10a5
Move stuff to docs
fsschneider Feb 11, 2025
466ad7a
Highlight rolling leaderboard
fsschneider Feb 11, 2025
4dadef0
update import path in randaugment.py
priyakasimbeg Feb 11, 2025
f63e906
Modify text with new repository
fsschneider Feb 11, 2025
dbefd09
Merge branch 'dev' of github.com:fsschneider/algorithmic-efficiency i…
fsschneider Feb 11, 2025
d86273c
Slight tweak
fsschneider Feb 11, 2025
1ee83e2
Update for rolling leaderboard
fsschneider Feb 11, 2025
f375099
isort changes
priyakasimbeg Feb 11, 2025
45a9b9a
Update for rolling leaderboard
fsschneider Feb 11, 2025
4345e8b
Merge pull request #840 from mlcommons/python_upgrades
priyakasimbeg Feb 11, 2025
e9d4342
Merge branch 'dev' of github.com:fsschneider/algorithmic-efficiency i…
fsschneider Feb 11, 2025
5c4c07d
Merge pull request #842 from fsschneider/dev
priyakasimbeg Feb 11, 2025
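
One of the commits above (b65157e) replaces `getargspec` with `getfullargspec` because the former is not available in Python 3.11. The snippet below is a minimal sketch of that swap using only the standard library; `update_params` and its parameters are hypothetical and stand in for whatever function the repository actually inspects.

```python
import inspect


def update_params(workload, current_params, hyperparameters=None, *, rng=None):
  """Hypothetical function, used only to have a signature to inspect."""
  return current_params


# inspect.getargspec was removed in Python 3.11. getfullargspec is the
# drop-in replacement and additionally reports keyword-only arguments.
spec = inspect.getfullargspec(update_params)

print(spec.args)        # positional-or-keyword parameter names
print(spec.defaults)    # defaults of the trailing positional parameters
print(spec.kwonlyargs)  # keyword-only names, which getargspec could not describe
```

Code that only read `.args` or `.defaults` from the old result usually keeps working unchanged, since `getfullargspec` returns those fields under the same names.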
50 changes: 25 additions & 25 deletions .github/workflows/CI.yml
@@ -7,10 +7,10 @@ jobs:
  runs-on: ubuntu-latest
  steps:
  - uses: actions/checkout@v3
- - name: Set up Python 3.9
+ - name: Set up Python 3.11.10
  uses: actions/setup-python@v4
  with:
- python-version: 3.9
+ python-version: 3.11.10
  cache: 'pip' # Cache pip dependencies.
  cache-dependency-path: '**/setup.py'
  - name: Install Modules and Run
@@ -25,10 +25,10 @@ jobs:
  runs-on: ubuntu-latest
  steps:
  - uses: actions/checkout@v3
- - name: Set up Python 3.9
+ - name: Set up Python 3.11.10
  uses: actions/setup-python@v4
  with:
- python-version: 3.9
+ python-version: 3.11.10
  cache: 'pip' # Cache pip dependencies.
  cache-dependency-path: '**/setup.py'
  - name: Install Modules and Run
@@ -42,10 +42,10 @@ jobs:
  runs-on: ubuntu-latest
  steps:
  - uses: actions/checkout@v3
- - name: Set up Python 3.9
+ - name: Set up Python 3.11.10
  uses: actions/setup-python@v4
  with:
- python-version: 3.9
+ python-version: 3.11.10
  cache: 'pip' # Cache pip dependencies.
  cache-dependency-path: '**/setup.py'
  - name: Install Modules and Run
@@ -59,10 +59,10 @@ jobs:
  runs-on: ubuntu-latest
  steps:
  - uses: actions/checkout@v3
- - name: Set up Python 3.9
+ - name: Set up Python 3.11.10
  uses: actions/setup-python@v4
  with:
- python-version: 3.9
+ python-version: 3.11.10
  cache: 'pip' # Cache pip dependencies.
  cache-dependency-path: '**/setup.py'
  - name: Install Modules and Run
@@ -77,10 +77,10 @@ jobs:
  runs-on: ubuntu-latest
  steps:
  - uses: actions/checkout@v3
- - name: Set up Python 3.9
+ - name: Set up Python 3.11.10
  uses: actions/setup-python@v4
  with:
- python-version: 3.9
+ python-version: 3.11.10
  cache: 'pip' # Cache pip dependencies.
  cache-dependency-path: '**/setup.py'
  - name: Install Modules and Run
@@ -96,10 +96,10 @@ jobs:
  runs-on: ubuntu-latest
  steps:
  - uses: actions/checkout@v3
- - name: Set up Python 3.9
+ - name: Set up Python 3.11.10
  uses: actions/setup-python@v4
  with:
- python-version: 3.9
+ python-version: 3.11.10
  cache: 'pip' # Cache pip dependencies.
  cache-dependency-path: '**/setup.py'
  - name: Install Modules and Run
@@ -113,10 +113,10 @@ jobs:
  runs-on: ubuntu-latest
  steps:
  - uses: actions/checkout@v3
- - name: Set up Python 3.9
+ - name: Set up Python 3.11.10
  uses: actions/setup-python@v4
  with:
- python-version: 3.9
+ python-version: 3.11.10
  cache: 'pip' # Cache pip dependencies.
  cache-dependency-path: '**/setup.py'
  - name: Install Modules and Run
@@ -130,10 +130,10 @@ jobs:
  runs-on: ubuntu-latest
  steps:
  - uses: actions/checkout@v3
- - name: Set up Python 3.9
+ - name: Set up Python 3.11.10
  uses: actions/setup-python@v4
  with:
- python-version: 3.9
+ python-version: 3.11.10
  cache: 'pip' # Cache pip dependencies.
  cache-dependency-path: '**/setup.py'
  - name: Install Modules and Run
@@ -148,10 +148,10 @@ jobs:
  runs-on: ubuntu-latest
  steps:
  - uses: actions/checkout@v3
- - name: Set up Python 3.9
+ - name: Set up Python 3.11.10
  uses: actions/setup-python@v4
  with:
- python-version: 3.9
+ python-version: 3.11.10
  cache: 'pip' # Cache pip dependencies.
  cache-dependency-path: '**/setup.py'
  - name: Install Modules and Run
@@ -166,10 +166,10 @@ jobs:
  runs-on: ubuntu-latest
  steps:
  - uses: actions/checkout@v3
- - name: Set up Python 3.9
+ - name: Set up Python 3.11.10
  uses: actions/setup-python@v4
  with:
- python-version: 3.9
+ python-version: 3.11.10
  cache: 'pip' # Cache pip dependencies.
  cache-dependency-path: '**/setup.py'
  - name: Install Modules and Run
@@ -184,10 +184,10 @@ jobs:
  runs-on: ubuntu-latest
  steps:
  - uses: actions/checkout@v3
- - name: Set up Python 3.9
+ - name: Set up Python 3.11.10
  uses: actions/setup-python@v4
  with:
- python-version: 3.9
+ python-version: 3.11.10
  cache: 'pip' # Cache pip dependencies.
  cache-dependency-path: '**/setup.py'
  - name: Install pytest
@@ -199,7 +199,7 @@ jobs:
  pip install .[pytorch_cpu]
  - name: Run pytest tests
  run: |
- pytest -vx tests/version_test.py
+ pytest -vx tests/test_version.py
  pytest -vx tests/test_num_params.py
  pytest -vx tests/test_param_shapes.py
  pytest -vx tests/test_param_types.py
@@ -208,10 +208,10 @@ jobs:
  runs-on: ubuntu-latest
  steps:
  - uses: actions/checkout@v3
- - name: Set up Python 3.9
+ - name: Set up Python 3.11.10
  uses: actions/setup-python@v4
  with:
- python-version: 3.9
+ python-version: 3.11.10
  cache: 'pip' # Cache pip dependencies.
  cache-dependency-path: '**/setup.py'
  - name: Install pytest
18 changes: 9 additions & 9 deletions .github/workflows/linting.yml
@@ -7,17 +7,17 @@ jobs:
  runs-on: ubuntu-latest
  steps:
  - uses: actions/checkout@v2
- - name: Set up Python 3.9
+ - name: Set up Python 3.11.10
  uses: actions/setup-python@v2
  with:
- python-version: 3.9
+ python-version: 3.11.10
  - name: Install pylint
  run: |
  python -m pip install --upgrade pip
  pip install pylint==2.16.1
  - name: Run pylint
  run: |
- pylint algorithmic_efficiency
+ pylint algoperf
  pylint reference_algorithms
  pylint prize_qualification_baselines
  pylint submission_runner.py
@@ -27,14 +27,14 @@ jobs:
  runs-on: ubuntu-latest
  steps:
  - uses: actions/checkout@v2
- - name: Set up Python 3.9
+ - name: Set up Python 3.11.10
  uses: actions/setup-python@v2
  with:
- python-version: 3.9
+ python-version: 3.11.10
  - name: Install isort
  run: |
  python -m pip install --upgrade pip
- pip install isort
+ pip install isort==5.12.0
  - name: Run isort
  run: |
  isort . --check --diff
@@ -43,14 +43,14 @@ jobs:
  runs-on: ubuntu-latest
  steps:
  - uses: actions/checkout@v2
- - name: Set up Python 3.9
+ - name: Set up Python 3.11.10
  uses: actions/setup-python@v2
  with:
- python-version: 3.9
+ python-version: 3.11.10
  - name: Install yapf
  run: |
  python -m pip install --upgrade pip
- pip install yapf==0.32
+ pip install yapf==0.32 toml
  - name: Run yapf
  run: |
  yapf . --diff --recursive
2 changes: 1 addition & 1 deletion .github/workflows/regression_tests_variants.yml
@@ -72,7 +72,7 @@ jobs:
  run: |
  docker pull us-central1-docker.pkg.dev/training-algorithms-external/mlcommons-docker-repo/algoperf_pytorch_${{ github.head_ref || github.ref_name }}
  docker run -v $HOME/data/:/data/ -v $HOME/experiment_runs/:/experiment_runs -v $HOME/experiment_runs/logs:/logs --gpus all --ipc=host us-central1-docker.pkg.dev/training-algorithms-external/mlcommons-docker-repo/algoperf_pytorch_${{ github.head_ref || github.ref_name }} -d criteo1tb -f pytorch -s reference_algorithms/paper_baselines/adamw/pytorch/submission.py -w criteo1tb_resnet -t reference_algorithms/paper_baselines/adamw/tuning_search_space.json -e tests/regression_tests/adamw -m 10 -c False -o True -r false
- criteo_resnet_pytorch:
+ criteo_embed_init_pytorch:
  runs-on: self-hosted
  needs: build_and_push_pytorch_docker_image
  steps:
8 changes: 5 additions & 3 deletions .gitignore
@@ -12,8 +12,8 @@ makefile
  *.swp
  */data/
  *events.out.tfevents*
- algorithmic_efficiency/workloads/librispeech_conformer/data_dir
- algorithmic_efficiency/workloads/librispeech_conformer/work_dir
+ algoperf/workloads/librispeech_conformer/data_dir
+ algoperf/workloads/librispeech_conformer/work_dir
  *.flac
  *.npy
  *.csv
@@ -23,4 +23,6 @@ wandb/
  scoring/plots/

  !scoring/test_data/experiment_dir/study_0/mnist_jax/trial_0/eval_measurements.csv
- !scoring/test_data/experiment_dir/study_0/mnist_jax/trial_1/eval_measurements.csv
+ !scoring/test_data/experiment_dir/study_0/mnist_jax/trial_1/eval_measurements.csv
+
+ algoperf/_version.py
58 changes: 36 additions & 22 deletions README.md
@@ -6,12 +6,12 @@
  </p>

  <p align="center">
- <a href="https://arxiv.org/abs/2306.07179" target="_blank">Paper (arXiv)</a> •
- <a href="/CALL_FOR_SUBMISSIONS.md">Call for Submissions</a> •
- <a href="/GETTING_STARTED.md">Getting Started</a> •
- <a href="/COMPETITION_RULES.md">Competition Rules</a> •
- <a href="/DOCUMENTATION.md">Documentation</a> •
- <a href="/CONTRIBUTING.md">Contributing</a>
+ <a href="https://github.com/mlcommons/submissions_algorithms">Leaderboard</a> •
+ <a href="/docs/GETTING_STARTED.md">Getting Started</a> •
+ <a href="https://github.com/mlcommons/submissions_algorithms">Submit</a> •
+ <a href="/docs/DOCUMENTATION.md">Documentation</a> •
+ <a href="/docs/CONTRIBUTING.md">Contributing</a> •
+ <a href="https://arxiv.org/abs/2306.07179" target="_blank">Benchmark</a>/<a href="https://openreview.net/forum?id=CtM5xjRSfm" target="_blank">Results</a> Paper
  </p>

  [![CI](https://github.com/mlcommons/algorithmic-efficiency/actions/workflows/CI.yml/badge.svg)](https://github.com/mlcommons/algorithmic-efficiency/actions/workflows/CI.yml)
@@ -22,19 +22,21 @@

  ---

- > *AlgoPerf* is a suite of benchmarks and competitions to measure neural network training speedups due to algorithmic improvements in both training algorithms and models. This is the repository for the *AlgoPerf: Training Algorithms benchmark* and its associated competition. It is developed by the [MLCommons Algorithms Working Group](https://mlcommons.org/en/groups/research-algorithms/). This repository holds the [**competition rules**](/COMPETITION_RULES.md), the [**technical documentation**](/DOCUMENTATION.md) of the benchmark, [**getting started guides**](/GETTING_STARTED.md), and the benchmark code. For a detailed description of the benchmark design, see our [**paper**](https://arxiv.org/abs/2306.07179).
-
+ > This is the repository for the *AlgoPerf: Training Algorithms benchmark* measuring neural network training speedups due to algorithmic improvements.
+ > It is developed by the [MLCommons Algorithms Working Group](https://mlcommons.org/en/groups/research-algorithms/).
+ > This repository holds the benchmark code, the benchmark's [**technical documentation**](/docs/DOCUMENTATION.md) and [**getting started guides**](/docs/GETTING_STARTED.md). For a detailed description of the benchmark design, see our [**introductory paper**](https://arxiv.org/abs/2306.07179), for the results of the inaugural competition see our [**results paper**](https://openreview.net/forum?id=CtM5xjRSfm).
+ >
+ > **See our [AlgoPerf Leaderboard](https://github.com/mlcommons/submissions_algorithms) for the latest results of the benchmark and to submit your algorithm.**
  ---

  > [!IMPORTANT]
- > The results of the inaugural AlgoPerf: Training Algorithms benchmark competition have been announced. See the [MLCommons blog post](https://mlcommons.org/2024/08/mlc-algoperf-benchmark-competition/) for an overview and the [results page](https://mlcommons.org/benchmarks/algorithms/) for more details on the results. We are currently preparing an in-depth analysis of the results in the form of a paper and plan the next iteration of the benchmark competition.
+ > For future iterations of the AlgoPerf: Training Algorithms benchmark competition, we are switching to a rolling leaderboard, making a few changes to the competition rules, and also run all selected submissions on our hardware. **To submit your algorithm to the next iteration of the benchmark, please see our [How to Submit](#how-to-submit) section and the [submission repository](https://github.com/mlcommons/submissions_algorithms) which hosts the up to date AlgoPerf leaderboard.**

  ## Table of Contents <!-- omit from toc -->

  - [Installation](#installation)
  - [Getting Started](#getting-started)
- - [Call for Submissions](#call-for-submissions)
- - [Competition Rules](#competition-rules)
+ - [How to Submit](#how-to-submit)
  - [Technical Documentation of the Benchmark \& FAQs](#technical-documentation-of-the-benchmark--faqs)
  - [Contributing](#contributing)
  - [License](#license)
@@ -45,9 +47,9 @@
  > [!TIP]
  > **If you have any questions about the benchmark competition or you run into any issues, please feel free to contact us.** Either [file an issue](https://github.com/mlcommons/algorithmic-efficiency/issues), ask a question on [our Discord](https://discord.gg/5FPXK7SMt6) or [join our weekly meetings](https://mlcommons.org/en/groups/research-algorithms/).

- You can install this package and dependencies in a [Python virtual environment](/GETTING_STARTED.md#python-virtual-environment) or use a [Docker/Singularity/Apptainer container](/GETTING_STARTED.md#docker) (recommended).
+ You can install this package and dependencies in a [Python virtual environment](/docs/GETTING_STARTED.md#python-virtual-environment) or use a [Docker/Singularity/Apptainer container](/docs/GETTING_STARTED.md#docker) (recommended).
  We recommend using a Docker container (or alternatively, a Singularity/Apptainer container) to ensure a similar environment to our scoring and testing environments.
- Both options are described in detail in the [**Getting Started**](/GETTING_STARTED.md) document.
+ Both options are described in detail in the [**Getting Started**](/docs/GETTING_STARTED.md) document.

  *TL;DR to install the Jax version for GPU run:*

@@ -67,7 +69,7 @@ pip3 install -e '.[full]'

  ## Getting Started

- For detailed instructions on developing and scoring your own algorithm in the benchmark see the [Getting Started](/GETTING_STARTED.md) document.
+ For detailed instructions on developing your own algorithm in the benchmark see the [Getting Started](/docs/GETTING_STARTED.md) document.

  *TL;DR running a JAX workload:*

@@ -93,23 +95,19 @@ python3 submission_runner.py \
  --tuning_search_space=reference_algorithms/paper_baselines/adamw/tuning_search_space.json
  ```

- ## Call for Submissions
-
- The [Call for Submissions](/CALL_FOR_SUBMISSIONS.md) announces the first iteration of the AlgoPerf: Training Algorithms competition based on the benchmark by the same name. This document also contains the schedule and key dates for the competition.
-
- ### Competition Rules
+ ## How to Submit

- The competition rules for the *AlgoPerf: Training Algorithms* competition can be found in the separate [**Competition Rules**](/COMPETITION_RULES.md) document.
+ Once you have developed your training algorithm, you can submit it to the benchmark by creating a pull request to the [submission repository](https://github.com/mlcommons/submissions_algorithms), which hosts the AlgoPerf leaderboard. The AlgoPerf working group will review your PR. Based on our available resources and the perceived potential of the method, it will be selected for a free evaluation. If selected, we will run your algorithm on our hardware and update the leaderboard with the results.

  ### Technical Documentation of the Benchmark & FAQs

- We provide additional technical documentation of the benchmark and answer frequently asked questions in a separate [**Documentation**](/DOCUMENTATION.md) page. Suggestions, clarifications and questions can be raised via pull requests, creating an issue, or by sending an email to the [working group](mailto:[email protected]).
+ We provide a technical documentation of the benchmark and answer frequently asked questions in a separate [**Documentation**](/docs/DOCUMENTATION.md) page. This includes which types of submissions are allowed. Please ensure that your submission is compliant with these rules before submitting. Suggestions, clarifications and questions can be raised via pull requests, creating an issue, or by sending an email to the [working group](mailto:[email protected]).

  ## Contributing

  We invite everyone to look through our rules, documentation, and codebase and submit issues and pull requests, e.g. for rules changes, clarifications, or any bugs you might encounter. If you are interested in contributing to the work of the working group and influence the benchmark's design decisions, please [join the weekly meetings](https://mlcommons.org/en/groups/research-algorithms/) and consider becoming a member of the working group.

- Our [**Contributing**](/CONTRIBUTING.md) document provides further MLCommons contributing guidelines and additional setup and workflow instructions.
+ Our [**Contributing**](/docs/CONTRIBUTING.md) document provides further MLCommons contributing guidelines and additional setup and workflow instructions.

  ## License

@@ -134,3 +132,19 @@ If you are using the *AlgoPerf benchmark*, its codebase, baselines, or workloads
  eprint = {2306.07179},
  }
  ```
+
+ If you use the results from the first *AlgoPerf competition*, please consider citing the results paper, as well as the relevant submissions:
+
+ > [Kasimbeg, Schneider, Eschenhagen, et al.<br/>
+ > **Accelerating neural network training: An analysis of the AlgoPerf competition**<br/>
+ > ICLR 2025](https://openreview.net/forum?id=CtM5xjRSfm)
+
+ ```bibtex
+ @inproceedings{Kasimbeg2025AlgoPerfResults,
+ title = {Accelerating neural network training: An analysis of the {AlgoPerf} competition},
+ author = {Kasimbeg, Priya and Schneider, Frank and Eschenhagen, Runa and Bae, Juhan and Sastry, Chandramouli Shama and Saroufim, Mark and Boyuan, Feng and Wright, Less and Yang, Edward Z. and Nado, Zachary and Medapati, Sourabh and Hennig, Philipp and Rabbat, Michael and Dahl, George E.},
+ booktitle = {The Thirteenth International Conference on Learning Representations},
+ year = {2025},
+ url = {https://openreview.net/forum?id=CtM5xjRSfm}
+ }
+ ```
5 changes: 5 additions & 0 deletions algoperf/__init__.py
@@ -0,0 +1,5 @@
+ """Algorithmic Efficiency."""
+
+ from ._version import version as __version__
+
+ __all__ = ["__version__"]
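
The new `__init__.py` re-exports the version from a generated `_version.py`, and commit bad76f5 hands versioning to `setuptools_scm`, so the exact string can carry development suffixes. Commit 8171a32 mentions restricting the version test to the major and minor elements. A minimal sketch of such a check follows; the helper `major_minor` and the sample version strings are hypothetical and not taken from the repository.

```python
import re


def major_minor(version: str) -> tuple[int, int]:
  """Return (major, minor) from a version string, ignoring any suffix."""
  match = re.match(r"(\d+)\.(\d+)", version)
  if match is None:
    raise ValueError(f"Unrecognized version string: {version!r}")
  return int(match.group(1)), int(match.group(2))


# Illustrative strings only: a plain release and a development build with a
# distance and commit-hash suffix.
print(major_minor("0.5.0"))
print(major_minor("0.5.1.dev12+g1a2b3c4"))
```

Comparing only these two components keeps a test stable across untagged commits, where the patch and local parts of the generated version change.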
Original file line number Diff line number Diff line change
@@ -16,8 +16,8 @@
  from tensorflow.io import gfile # pytype: disable=import-error
  import torch

- from algorithmic_efficiency import spec
- from algorithmic_efficiency.pytorch_utils import pytorch_setup
+ from algoperf import spec
+ from algoperf.pytorch_utils import pytorch_setup

  _, _, DEVICE, _ = pytorch_setup()
  CheckpointReturn = Tuple[spec.OptimizerState,
@@ -231,7 +231,7 @@ def save_checkpoint(framework: str,
  target=checkpoint_state,
  step=global_step,
  overwrite=True,
- keep=np.Inf if save_intermediate_checkpoints else 1)
+ keep=np.inf if save_intermediate_checkpoints else 1)
  else:
  if not save_intermediate_checkpoints:
  checkpoint_files = gfile.glob(
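
The last hunk above changes `np.Inf` to `np.inf`. The capitalized alias was removed in NumPy 2.0, while the lowercase name exists in both 1.x and 2.x. A minimal sketch of the same expression in isolation follows; the flag value is assumed for illustration.

```python
import numpy as np

# np.inf is the canonical spelling and works on NumPy 1.x and 2.x.
# The alias np.Inf raises AttributeError on NumPy 2.0 and later.
save_intermediate_checkpoints = True  # assumed value, for illustration only
keep = np.inf if save_intermediate_checkpoints else 1

print(keep)
```

`np.inf` is an ordinary Python float, so it compares equal to `float("inf")` and can be passed wherever a plain float is accepted.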