Updates by the bot instance eessi-bot-mc-aws
received bot command build instance:eessi-bot-mc-azure repository:eessi.io-2023.06-software architecture:x86_64/amd/zen4 accelerator:nvidia/cc90 from trz42
received bot command build instance:eessi-bot-mc-aws repository:eessi.io-2023.06-software architecture:x86_64/amd/zen3 accelerator:nvidia/cc80 from trz42
received bot command build instance:eessi-bot-mc-aws repository:eessi.io-2023.06-software architecture:x86_64/amd/zen2 accelerator:nvidia/cc80 from trz42
Updates by the bot instance eessi-bot-mc-azure
received bot command build instance:eessi-bot-mc-azure repository:eessi.io-2023.06-software architecture:x86_64/amd/zen4 accelerator:nvidia/cc90 from trz42
received bot command build instance:eessi-bot-mc-aws repository:eessi.io-2023.06-software architecture:x86_64/amd/zen3 accelerator:nvidia/cc80 from trz42
received bot command build instance:eessi-bot-mc-aws repository:eessi.io-2023.06-software architecture:x86_64/amd/zen2 accelerator:nvidia/cc80 from trz42
Updates by the bot instance eessi-bot-vsc-ugent
received bot command build instance:eessi-bot-mc-azure repository:eessi.io-2023.06-software architecture:x86_64/amd/zen4 accelerator:nvidia/cc90 from trz42
received bot command build instance:eessi-bot-mc-aws repository:eessi.io-2023.06-software architecture:x86_64/amd/zen3 accelerator:nvidia/cc80 from trz42
received bot command build instance:eessi-bot-mc-aws repository:eessi.io-2023.06-software architecture:x86_64/amd/zen2 accelerator:nvidia/cc80 from trz42
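Every bot instance logs all three commands, yet jobs appear only on the instance named in each command's instance: field. The sketch below illustrates that routing under the assumption that an instance acts only on commands addressed to it; `parse_bot_command` and `is_addressed_to` are hypothetical helpers, not functions of the bot itself.

```python
def parse_bot_command(command):
    """Split 'build key:value ...' into the action and a dict of arguments."""
    action, *pairs = command.split()
    args = dict(pair.split(":", 1) for pair in pairs)
    return action, args


def is_addressed_to(command, instance_name):
    """Assumption: an instance only acts on commands whose instance: matches."""
    _, args = parse_bot_command(command)
    return args.get("instance") == instance_name


# The three commands exactly as logged above.
RECEIVED = [
    "build instance:eessi-bot-mc-azure repository:eessi.io-2023.06-software "
    "architecture:x86_64/amd/zen4 accelerator:nvidia/cc90",
    "build instance:eessi-bot-mc-aws repository:eessi.io-2023.06-software "
    "architecture:x86_64/amd/zen3 accelerator:nvidia/cc80",
    "build instance:eessi-bot-mc-aws repository:eessi.io-2023.06-software "
    "architecture:x86_64/amd/zen2 accelerator:nvidia/cc80",
]

# Count how many of the received commands each instance would act on.
jobs_per_instance = {
    name: sum(is_addressed_to(cmd, name) for cmd in RECEIVED)
    for name in ("eessi-bot-mc-aws", "eessi-bot-mc-azure", "eessi-bot-vsc-ugent")
}
print(jobs_per_instance)
```

Under this assumption eessi-bot-vsc-ugent receives the commands but starts no job, which matches the job reports that follow.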
New job on instance eessi-bot-mc-azure for CPU micro-architecture x86_64-amd-zen4 and accelerator nvidia/cc90 for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2025.02/pr_919/1085
| date | job status | comment |
| --- | --- | --- |
| Feb 17 12:50:41 UTC 2025 | submitted | job id 1085 awaits release by job manager |
| Feb 17 12:50:51 UTC 2025 | released | job awaits launch by Slurm scheduler |
| Feb 17 12:56:54 UTC 2025 | running | job 1085 is running |
| Feb 17 13:56:16 UTC 2025 | finished | 😁 SUCCESS. Details: ✅ job output file slurm-1085.out ✅ no message matching FATAL: ✅ no message matching ERROR: ✅ no message matching FAILED: ✅ no message matching required modules missing: ✅ found message(s) matching No missing installations ✅ found message matching .tar.gz created! Artefacts: eessi-2023.06-software-linux-x86_64-amd-zen4-1739798555.tar.gz, size: 4373 MiB (4585648527 bytes), entries: 11757. Modules under 2023.06/software/linux/x86_64/amd/zen4/accel/nvidia/cc90/modules/all: CUDA/12.1.1.lua, CUDA/12.4.0.lua. Software under 2023.06/software/linux/x86_64/amd/zen4/accel/nvidia/cc90/software: CUDA/12.1.1, CUDA/12.4.0. Other under 2023.06/software/linux/x86_64/amd/zen4/accel/nvidia/cc90: no other files in tarball. |
| Feb 17 13:56:16 UTC 2025 | test result | 😢 FAILURE. Reason: EESSI test suite was not run, test step itself failed to execute. Details: ✅ job output file slurm-1085.out ❌ found message matching ERROR: ✅ no message matching [\s*FAILED\s*].*Ran .* test case |
| Feb 18 09:18:47 UTC 2025 | uploaded | transfer of eessi-2023.06-software-linux-x86_64-amd-zen4-1739798555.tar.gz to S3 bucket succeeded |
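The build verdict in these reports comes from a fixed checklist: four patterns that must be absent from the job output and two that must be present. A minimal sketch of such a check follows. It is not the bot's actual implementation; the function names and the sample log text are made up for illustration, and only the patterns are taken from the report.

```python
import re

# Patterns listed under "Details" in the job report.
MUST_BE_ABSENT = ["FATAL: ", "ERROR: ", "FAILED: ", "required modules missing:"]
MUST_BE_PRESENT = ["No missing installations", r"\.tar\.gz created!"]


def check_build_log(text):
    """Return a mapping from check description to True (passed) or False."""
    results = {}
    for pattern in MUST_BE_ABSENT:
        results[f"no message matching {pattern}"] = re.search(pattern, text) is None
    for pattern in MUST_BE_PRESENT:
        results[f"found message matching {pattern}"] = re.search(pattern, text) is not None
    return results


def build_succeeded(text):
    """The build counts as successful only if every single check passes."""
    return all(check_build_log(text).values())


# Hypothetical log excerpts, invented for this example.
good_log = "No missing installations, party time!\n/tmp/example.tar.gz created!\n"
bad_log = good_log + "ERROR: something went wrong\n"

print(build_succeeded(good_log), build_succeeded(bad_log))
```

The test result row appears to use a separate pattern list in the same way, which would explain how a job can report SUCCESS for the build and FAILURE for the tests.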
New job on instance eessi-bot-mc-aws for CPU micro-architecture x86_64-amd-zen3 and accelerator nvidia/cc80 for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2025.02/pr_919/46531
| date | job status | comment |
| --- | --- | --- |
| Feb 17 12:50:42 UTC 2025 | submitted | job id 46531 awaits release by job manager |
| Feb 17 12:51:39 UTC 2025 | released | job awaits launch by Slurm scheduler |
| Feb 17 12:59:44 UTC 2025 | running | job 46531 is running |
| Feb 17 14:01:56 UTC 2025 | finished | 😁 SUCCESS. Details: ✅ job output file slurm-46531.out ✅ no message matching FATAL: ✅ no message matching ERROR: ✅ no message matching FAILED: ✅ no message matching required modules missing: ✅ found message(s) matching No missing installations ✅ found message matching .tar.gz created! Artefacts: eessi-2023.06-software-linux-x86_64-amd-zen3-1739798896.tar.gz, size: 4373 MiB (4585650573 bytes), entries: 11757. Modules under 2023.06/software/linux/x86_64/amd/zen3/accel/nvidia/cc80/modules/all: CUDA/12.1.1.lua, CUDA/12.4.0.lua. Software under 2023.06/software/linux/x86_64/amd/zen3/accel/nvidia/cc80/software: CUDA/12.1.1, CUDA/12.4.0. Other under 2023.06/software/linux/x86_64/amd/zen3/accel/nvidia/cc80: no other files in tarball. |
| Feb 17 14:01:56 UTC 2025 | test result | 😢 FAILURE. Reason: EESSI test suite was not run, test step itself failed to execute. Details: ✅ job output file slurm-46531.out ❌ found message matching ERROR: ✅ no message matching [\s*FAILED\s*].*Ran .* test case |
| Feb 18 09:18:35 UTC 2025 | uploaded | transfer of eessi-2023.06-software-linux-x86_64-amd-zen3-1739798896.tar.gz to S3 bucket succeeded |
New job on instance eessi-bot-mc-aws for CPU micro-architecture x86_64-amd-zen2 and accelerator nvidia/cc80 for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2025.02/pr_919/46532
test step failed with:
ERROR: failed to load configuration: could not find a configuration entry for the requested system/partition combination: 'BotBuildTests:x86_64_amd_zen2_nvidia_cc80'
Log file(s) saved in '/tmp/tmp.skuL75P5X3/rfm-0an7_lvk.log'
ERROR: Failed to list ReFrame tests with command: reframe --tag CI --tag 1_node --nocolor -n EESSI_OSU -n EESSI_LAMMPS --list
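The partition name in this error looks like the CPU target and the accelerator joined with underscores. The sketch below reproduces the name under that assumption and shows how a lookup against a set of configured partitions would fail. The helper names and the contents of `configured` are hypothetical; the actual ReFrame configuration used by the test step is not shown on this page.

```python
def partition_name(arch, accel=None):
    """Assumed naming scheme: replace '/' with '_' and append the accelerator."""
    parts = [arch.replace("/", "_")]
    if accel:
        parts.append(accel.replace("/", "_"))
    return "_".join(parts)


def missing_entry(system, arch, accel, configured):
    """Return the 'system:partition' string if it is not configured, else None."""
    name = f"{system}:{partition_name(arch, accel)}"
    return None if name in configured else name


# Hypothetical configuration that only knows the CPU-only partition.
configured = {"BotBuildTests:x86_64_amd_zen2"}

missing = missing_entry("BotBuildTests", "x86_64/amd/zen2", "nvidia/cc80", configured)
print(missing)
```

If the naming assumption holds, resolving the error would mean adding an entry for the CPU and accelerator combination to the configuration that the test step loads.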
| date | job status | comment |
| --- | --- | --- |
| Feb 17 12:50:46 UTC 2025 | submitted | job id 46532 awaits release by job manager |
| Feb 17 12:51:37 UTC 2025 | released | job awaits launch by Slurm scheduler |
| Feb 17 12:59:42 UTC 2025 | running | job 46532 is running |
| Feb 17 14:11:08 UTC 2025 | finished | 😁 SUCCESS. Details: ✅ job output file slurm-46532.out ✅ no message matching FATAL: ✅ no message matching ERROR: ✅ no message matching FAILED: ✅ no message matching required modules missing: ✅ found message(s) matching No missing installations ✅ found message matching .tar.gz created! Artefacts: eessi-2023.06-software-linux-x86_64-amd-zen2-1739799110.tar.gz, size: 4373 MiB (4585663906 bytes), entries: 11757. Modules under 2023.06/software/linux/x86_64/amd/zen2/accel/nvidia/cc80/modules/all: CUDA/12.1.1.lua, CUDA/12.4.0.lua. Software under 2023.06/software/linux/x86_64/amd/zen2/accel/nvidia/cc80/software: CUDA/12.1.1, CUDA/12.4.0. Other under 2023.06/software/linux/x86_64/amd/zen2/accel/nvidia/cc80: no other files in tarball. |
| Feb 17 14:11:08 UTC 2025 | test result | 😢 FAILURE. Reason: EESSI test suite was not run, test step itself failed to execute. Details: ✅ job output file slurm-46532.out ❌ found message matching ERROR: ✅ no message matching [\s*FAILED\s*].*Ran .* test case |
| Feb 18 09:19:37 UTC 2025 | uploaded | transfer of eessi-2023.06-software-linux-x86_64-amd-zen2-1739799110.tar.gz to S3 bucket succeeded |
PR merged! Moved ['/project/def-users/SHARED/jobs/2025.02/pr_919/46531', '/project/def-users/SHARED/jobs/2025.02/pr_919/46532'] to /project/def-users/SHARED/trash_bin/EESSI/software-layer/2025.02.18
Renewed version of #918.
After easybuilders/easybuild-easyblocks#3516 was merged, we need to update the module files for CUDA/12.{1.1,4.0}.
We need to do that for the following architecture combinations:
zen2 + cc80
zen3 + cc80
zen4 + cc90
For the first two we use the build cluster on AWS; for the third we use the build cluster on Azure. Because CUDA is just a binary installation, this should be fine.
Note: although only the module files need to be rebuilt, we cannot pass --module-only to EasyBuild, because the rebuild procedure removes the whole installation.
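The mapping from architecture combination to build cluster can be written down as data, from which the three build commands recorded in the bot updates follow mechanically. This is only an illustrative sketch: `build_command` is a hypothetical helper, and whatever prefix is needed to address the bot in a PR comment is not shown on this page, so it is left out.

```python
# (CPU target, accelerator, bot instance), taken from the description:
# zen2 and zen3 with cc80 build on AWS, zen4 with cc90 builds on Azure.
COMBINATIONS = [
    ("x86_64/amd/zen2", "nvidia/cc80", "eessi-bot-mc-aws"),
    ("x86_64/amd/zen3", "nvidia/cc80", "eessi-bot-mc-aws"),
    ("x86_64/amd/zen4", "nvidia/cc90", "eessi-bot-mc-azure"),
]
REPOSITORY = "eessi.io-2023.06-software"


def build_command(arch, accel, instance, repository=REPOSITORY):
    """Format one command in the form the bots report having received."""
    return (
        f"build instance:{instance} repository:{repository} "
        f"architecture:{arch} accelerator:{accel}"
    )


commands = [build_command(*combo) for combo in COMBINATIONS]
for command in commands:
    print(command)
```

Keeping the combinations in one list makes it harder to pair a CPU target with the wrong accelerator or cluster when more targets are added.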