Rebuild all CUDA software with EB-5.1.1 #1147
…y check, so we can see if anything is 'broken'. Also, there are so many 'holes' in which software is present for which combination of CPU+GPU, that this is a convenient way to fill the gaps
bot: build repo:eessi.io-2023.06-software instance:eessi-bot-surf architecture:x86_64/intel/icelake accelerator:nvidia/cc80
bot: build repo:eessi.io-2023.06-software instance:eessi-bot-surf architecture:x86_64/amd/zen4 accelerator:nvidia/cc90
New job on instance
bot: build repo:eessi.io-2023.06-software instance:eessi-bot-surf architecture:x86_64/amd/zen4 accelerator:nvidia/cc90
New job on instance
bot: build repo:eessi.io-2023.06-software instance:eessi-bot-surf architecture:x86_64/amd/zen4 accelerator:nvidia/cc90
New job on instance
bot: build repo:eessi.io-2023.06-software instance:eessi-bot-surf architecture:x86_64/amd/zen4 accelerator:nvidia/cc90
New job on instance
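For readers unfamiliar with the build commands above, here is a minimal sketch of how such a comment breaks down into its `key:value` parts. This is not the bot's actual parser; the function name `parse_bot_command` is hypothetical, and the only thing taken from this thread is the command format itself.

```python
def parse_bot_command(line: str) -> dict[str, str]:
    """Split a 'bot: <action> key:value ...' comment into a dict.

    Illustrative only: this mirrors the command format seen in this
    thread, not the real implementation used by the bot.
    """
    prefix, sep, rest = line.partition(":")
    if not sep or prefix.strip() != "bot":
        raise ValueError("not a bot command")

    tokens = rest.split()
    if not tokens:
        raise ValueError("missing action after 'bot:'")

    parsed = {"action": tokens[0]}
    for token in tokens[1:]:
        # Split on the first colon only, so values may contain '/' or '.'.
        key, colon, value = token.partition(":")
        if not colon or not value:
            raise ValueError(f"malformed argument: {token!r}")
        parsed[key] = value
    return parsed


cmd = parse_bot_command(
    "bot: build repo:eessi.io-2023.06-software instance:eessi-bot-surf "
    "architecture:x86_64/amd/zen4 accelerator:nvidia/cc90"
)
print(cmd["architecture"], cmd["accelerator"])
```

Each build therefore targets one CPU architecture and one accelerator (here an NVIDIA compute capability), which is why separate commands are issued per CPU+GPU combination.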
Hmmm, CUDA builds fail with:
Those are the files that are symlinked from host-injections, probably (at least |
Ah, found the issue:
Note that in the host-injections dir, the whole
but
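To check which files in an installation are symlinks into a host-injections directory, something like the following sketch could help. It uses only the Python standard library; the function name `symlinks_into` and any paths passed to it are hypothetical and not taken from the EESSI scripts.

```python
import os
from pathlib import Path


def symlinks_into(root, target_prefix):
    """Return (link, resolved_target) pairs for symlinks under `root`
    whose resolved target lies inside `target_prefix`.

    Illustrative helper only; not part of the EESSI software layer.
    """
    prefix = Path(os.path.realpath(target_prefix))
    hits = []
    # os.walk does not descend into symlinked directories by default,
    # but it still lists them in `dirnames`, so we check both lists.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = Path(dirpath) / name
            if path.is_symlink():
                target = Path(os.path.realpath(path))
                if target == prefix or prefix in target.parents:
                    hits.append((path, target))
    return hits


# Example call (both paths are placeholders):
# for link, target in symlinks_into("/path/to/cuda/install", "/path/to/host_injections"):
#     print(link, "->", target)
```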
https://github.com/EESSI/software-layer-scripts/blob/41f3775bfe214ecc51af2ea88f914d93414ed87b/eb_hooks.py#L1310 this is the line where it happens. Might actually be an issue with the setting of the
It seems strange that both are identical, I think the code expected |
Apparently, that is totally expected. Here, it's essentially set to the same value by https://github.com/EESSI/software-layer-scripts/blob/41f3775bfe214ecc51af2ea88f914d93414ed87b/init/modules/EESSI/2023.06.lua#L77 and https://github.com/EESSI/software-layer-scripts/blob/41f3775bfe214ecc51af2ea88f914d93414ed87b/init/modules/EESSI/2023.06.lua#L157
I think the bug is here. The |
@casparvl you are correct, the bot was previously setting the accelerator override in a way that did not include the |
This PR is on hold until EESSI/software-layer-scripts#59 is merged.
There are two reasons for this: