yeahdongcn
Collaborator

This PR introduces a minor performance improvement on MTGPU by applying updated compiler flags. It also addresses build warnings in recently updated files.

@github-actions bot added the labels "Nvidia GPU" (Issues specific to Nvidia GPUs) and "ggml" (changes relating to the ggml tensor library for machine learning) on Sep 26, 2025
@yeahdongcn force-pushed the xd/musa_compile_flags branch 2 times, most recently from 3db0b7e to 766bbab, on September 28, 2025 01:25
@yeahdongcn
Collaborator Author

There are 3 failed CI tests, but they don’t seem related to this PR.

@yeahdongcn force-pushed the xd/musa_compile_flags branch from 766bbab to 5c4459a on September 29, 2025 13:53
Signed-off-by: Xiaodong Ye <[email protected]>
@yeahdongcn force-pushed the xd/musa_compile_flags branch from 5c4459a to 896455b on October 2, 2025 11:45
@yeahdongcn
Collaborator Author

Just rebased on upstream/master to see if CI passes.

@yeahdongcn
Collaborator Author

@JohannesGaessler Could you please help merge this? The failed CI cases don’t appear to be related to my changes.

@ggerganov merged commit 91a2a56 into ggml-org:master on Oct 2, 2025
63 of 68 checks passed
Nexesenex added a commit to Nexesenex/croco.cpp that referenced this pull request Oct 2, 2025
Nexesenex added a commit to Nexesenex/croco.cpp that referenced this pull request Oct 3, 2025
gabe-l-hart added a commit to gabe-l-hart/llama.cpp that referenced this pull request Oct 3, 2025
* origin/master: (124 commits)
metal : fix loop bound in ggml_mem_ranges (ggml-org#16412)
llama : fix shapes for bert/mpt q/k norm (ggml-org#16409)
ggml : fix graph reallocation with multiple chunks (ggml-org#16396)
Fix missing messages on sibling navigation (ggml-org#16408)
vulkan: Replace uses of maxMemoryAllocationSize and VK_WHOLE_SIZE (ggml-org#16354)
vulkan: Fix FA coopmat1 invalid array indexing (ggml-org#16365)
ci : change macos-13 to macos-15-intel (ggml-org#16401)
Capture model name only after first token (streaming) or completed request (ggml-org#16405)
vulkan: in flash attention, bounds check against nem1 (don't rely on GGML_KQ_MASK_PAD) (ggml-org#16316)
webui : Fix messages payload sent to chat completions (ggml-org#16402)
fix: track viewportHeight via window.innerHeight to avoid unwanted scrolling (ggml-org#16356)
test-barrier : do not use more threads than physically available (ggml-org#16389)
ggml webgpu: add support for soft_max, optimize rms_norm (ggml-org#16357)
model : Apertus model implementation (ggml-org#15852)
musa: update compile flags (ggml-org#16265)
ci : fix ubuntu-latest-cmake-rpc (disable ccache) (ggml-org#16388)
ci: update vulkan ci (ggml-org#16294)
ci : fix clean-up of old logs (ggml-org#16381)
SYCL: Update to oneAPI 2025.2 (ggml-org#16371)
HIP: add IMbackK to codeowner (ggml-org#16375)
...
Nexesenex added a commit to Nexesenex/croco.cpp that referenced this pull request Oct 4, 2025
Nexesenex added a commit to Nexesenex/croco.cpp that referenced this pull request Oct 5, 2025
Nexesenex added a commit to Nexesenex/croco.cpp that referenced this pull request Oct 7, 2025
Nexesenex added a commit to Nexesenex/croco.cpp that referenced this pull request Oct 9, 2025
Nexesenex added a commit to Nexesenex/croco.cpp that referenced this pull request Oct 11, 2025
Nexesenex added a commit to Nexesenex/croco.cpp that referenced this pull request Oct 12, 2025
Nexesenex added a commit to Nexesenex/croco.cpp that referenced this pull request Oct 13, 2025
3 participants