Bug: Inconsistent ggml-4-x86-cuda-v100 ci failures on master #7613

Closed
mofosyne opened this issue May 29, 2024 · 3 comments
Labels
bug-unconfirmed, low severity (Used to report low severity bugs in llama.cpp, e.g. cosmetic issues, non-critical UI glitches), stale

Comments

mofosyne (Collaborator) commented May 29, 2024

Note: this is only one data point of CI failure, but it would be important to keep track of this behavior over the next few commits.

What happened?

Noticed that CI reported a failure in 20 - test-backend-ops; it would be good to identify the cause of this issue and potential ways to fix it. The failure in test #20 of test-backend-ops looked like the log below, which doesn't explain much to me, but hopefully it makes sense to someone else here.

The `[CPY] NMSE = 0.000003149 > 0.000000100` line looks interesting, however.
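
For reference, here is a minimal sketch of how a normalized mean squared error (NMSE) check like the one in the log could work. This is an assumption for illustration, not the actual test-backend-ops code: the backend's output is compared against a reference result, and the case fails when the NMSE exceeds the maximum allowed error (0.000000100 in the log).

```cpp
// Minimal sketch (assumed, not the actual test-backend-ops implementation):
// NMSE = sum((ref - out)^2) / sum(ref^2), failing when it exceeds a threshold.
#include <cstddef>
#include <cstdio>
#include <vector>

static double nmse(const std::vector<float> & ref, const std::vector<float> & out) {
    double err    = 0.0;
    double ref_sq = 0.0;
    for (size_t i = 0; i < ref.size(); ++i) {
        const double d = (double) ref[i] - (double) out[i];
        err    += d * d;
        ref_sq += (double) ref[i] * (double) ref[i];
    }
    return err / ref_sq;
}

int main() {
    const double max_nmse = 0.000000100; // threshold seen in the log
    // ref would be the CPU backend result, out the CUDA backend result
    std::vector<float> ref = {1.0f, 2.0f, 3.0f, 4.0f};
    std::vector<float> out = {1.0f, 2.0002f, 3.0f, 4.0f};
    const double e = nmse(ref, out);
    printf("[CPY] NMSE = %.9f %s\n", e, e > max_nmse ? "> threshold -> FAIL" : "-> OK");
    return 0;
}
```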

Name and Version

between commit 504f0c3 and 0e8d8bf

What operating system are you seeing the problem on?

Other? (Please let us know in description)

Relevant log output

OK
  CPY(type_src=f32,type_dst=q4_1,ne=[256,4,4,4]): ggml_backend_cuda_graph_compute: disabling CUDA graphs due to GPU architecture
ggml_backend_cuda_graph_compute: disabling CUDA graphs due to GPU architecture
[CPY] NMSE = 0.000003149 > 0.000000100 ggml_backend_cuda_graph_compute: disabling CUDA graphs due to GPU architecture
ggml_backend_cuda_graph_compute: disabling CUDA graphs due to GPU architecture
ggml_backend_cuda_graph_compute: disabling CUDA graphs due to GPU architecture
FAIL
  CPY(type_src=f32,type_dst=q5_0,ne=[256,4,4,4]): ggml_backend_cuda_graph_compute: disabling CUDA graphs due to GPU architecture
ggml_backend_cuda_graph_compute: disabling CUDA graphs due to GPU architecture
ggml_backend_cuda_graph_compute: disabling CUDA graphs due to GPU architecture
ggml_backend_cuda_graph_compute: disabling CUDA graphs due to GPU architecture
ggml_backend_cuda_graph_compute: disabling CUDA graphs due to GPU architecture
OK
mofosyne added the bug-unconfirmed and low severity labels on May 29, 2024
ggerganov (Member) commented

See this comment for explanation: #7425 (comment)

mofosyne (Collaborator, Author) commented May 29, 2024

Is there a way to reasonably suppress this, change the rounding approach, increase the threshold, or otherwise adapt to this error (e.g. only trigger a failure if a test trips two or more times)?
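
As a rough illustration of that last idea (purely hypothetical, not existing test-backend-ops behavior), a wrapper could rerun a flaky case and only report it as failed if it trips more than once:

```cpp
// Hypothetical sketch: rerun a case and only count it as a real failure
// if it trips at least twice (run_case would regenerate its random inputs
// internally on each call). Names here are illustrative, not llama.cpp API.
#include <cstdio>
#include <functional>

static bool fails_repeatedly(const std::function<bool()> & run_case, int max_runs = 2) {
    int trips = 0;
    for (int i = 0; i < max_runs; ++i) {
        if (!run_case()) {
            ++trips;
        }
    }
    return trips >= 2; // a single trip is treated as noise, not a failure
}

int main() {
    // Stand-in for a single CPY(type_src=f32,type_dst=q4_1) check.
    auto run_cpy_case = []() -> bool { return true; };
    printf("treat as failure: %s\n", fails_repeatedly(run_cpy_case) ? "yes" : "no");
    return 0;
}
```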

github-actions bot added the stale label on Jun 29, 2024
github-actions bot (Contributor) commented

This issue was closed because it has been inactive for 14 days since being marked as stale.
