SVE support for exponential functions #15145


Open · wants to merge 1 commit into base: master

Conversation

s-goto-11

This PR introduces support for SVE (Scalable Vector Extension) kernels for the exponential function on the Arm architecture.
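The PR body does not include the kernel itself. As a minimal sketch of what a predicated SVE exp kernel of this kind can look like, the following uses the standard range reduction exp(x) = 2^k · exp(r) with a low-order polynomial; the function name and constants are illustrative, accuracy handling (overflow, underflow, NaN) is omitted, and this is not the PR's actual code:

```c
#include <arm_sve.h>
#include <stdint.h>

// Hypothetical sketch of a predicated SVE exp kernel; NOT the PR's code.
// Range reduction: x = k*ln2 + r with |r| <= ln2/2, so exp(x) = 2^k * exp(r);
// exp(r) is approximated by a degree-4 polynomial (no overflow/underflow/NaN
// handling, so domain and accuracy are deliberately simplified).
void sve_vec_expf(float * y, const float * x, int64_t n) {
    const svfloat32_t log2e = svdup_n_f32(0x1.715476p+0f); // 1/ln(2)
    const svfloat32_t ln2   = svdup_n_f32(0x1.62e430p-1f); // ln(2)

    for (int64_t i = 0; i < n; i += svcntw()) {
        const svbool_t pg = svwhilelt_b32_s64(i, n);        // active-lane predicate
        const svfloat32_t v = svld1_f32(pg, x + i);

        // k = round(x / ln2), r = x - k*ln2
        const svfloat32_t kf = svrintn_f32_x(pg, svmul_f32_x(pg, v, log2e));
        const svfloat32_t r  = svmls_f32_x(pg, v, kf, ln2); // v - kf*ln2

        // Horner evaluation of 1 + r + r^2/2 + r^3/6 + r^4/24
        svfloat32_t p = svdup_n_f32(1.0f/24.0f);
        p = svmla_f32_x(pg, svdup_n_f32(1.0f/6.0f), p, r);  // 1/6 + p*r
        p = svmla_f32_x(pg, svdup_n_f32(0.5f),      p, r);  // 1/2 + p*r
        p = svmla_f32_x(pg, svdup_n_f32(1.0f),      p, r);  // 1   + p*r
        p = svmla_f32_x(pg, svdup_n_f32(1.0f),      p, r);  // 1   + p*r

        // exp(x) = 2^k * exp(r): scale with FSCALE, then store active lanes.
        const svint32_t ki = svcvt_s32_f32_x(pg, kf);
        svst1_f32(pg, y + i, svscale_f32_x(pg, p, ki));
    }
}
```

Because the predicate pg masks off inactive tail lanes, no scalar remainder loop is needed at any hardware vector length; note that pg can be const in such a loop, which is what the PR's lone commit ("Add const notation to variable pg", listed below) does in the real kernel.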

Verifying Features

This PR contains an SVE implementation of the exponential function used to compute the activation and softmax functions.
The outputs of the NEON and SVE implementations were compared element by element and confirmed to always agree closely.
We also verified that perplexity matches between the NEON and SVE implementations:

| Original (NEON) | This PR (SVE) |
| --- | --- |
| 6.6741 +/- 0.04126 | 6.6732 +/- 0.04125 |
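As a rough illustration of the element-wise comparison described above, a tolerance-based check could look like the sketch below. It reuses the hypothetical sve_vec_expf from the earlier sketch and uses scalar expf() as a stand-in for the NEON reference; neither is the PR's actual test harness.

```c
#include <math.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

// Hypothetical check in the spirit of the PR's verification: run a reference
// and a vectorized exp over the same inputs and compare with a tolerance.
void sve_vec_expf(float * y, const float * x, int64_t n); // sketch above

int main(void) {
    enum { N = 1 << 16 };
    float * x  = malloc(N * sizeof(float));
    float * y0 = malloc(N * sizeof(float));
    float * y1 = malloc(N * sizeof(float));

    srand(0);
    for (int i = 0; i < N; i++) {
        x[i] = 20.0f * rand() / (float) RAND_MAX - 10.0f; // inputs in [-10, 10)
    }

    for (int i = 0; i < N; i++) y0[i] = expf(x[i]);       // scalar reference
    sve_vec_expf(y1, x, N);                               // vectorized version

    // "Agree closely" = small relative error; bit-exact equality is not
    // expected, since the two paths reduce and round differently.
    float max_rel = 0.0f;
    for (int i = 0; i < N; i++) {
        const float rel = fabsf(y1[i] - y0[i]) / fmaxf(fabsf(y0[i]), 1e-30f);
        if (rel > max_rel) max_rel = rel;
    }
    printf("max relative difference: %g\n", max_rel);

    free(x); free(y0); free(y1);
    return max_rel < 1e-3f ? 0 : 1;                       // illustrative tolerance
}
```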

Performance check

Performance was measured on an FX700.
Performance improves as follows. The values are cycle counts for ggml_vec_soft_max_f32 (the ggml_v_expf symbol is stripped by optimization, so its cycles are counted in its caller).

| Batch size | Original (NEON) | This PR (SVE) | Ratio |
| ---: | ---: | ---: | ---: |
| 1 | 185,895,707 | 73,095,311 | 2.54 |
| 2 | 632,556,068 | 220,198,078 | 2.87 |
| 4 | 2,150,478,150 | 716,458,163 | 3.00 |
| 8 | 8,290,580,604 | 2,559,182,187 | 3.24 |

The command used to measure the performance is:

```
llama-batched --model ${PATH_TO_MODEL} --prompt 'AI is going to' --parallel 8 --predict 128 --seed 0 --threads 12
```
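The PR does not say how the per-symbol cycle counts were collected; one plausible approach on Linux (an assumption, not stated in the PR) is to wrap the run in `perf record -e cycles -- llama-batched ...` and read the `ggml_vec_soft_max_f32` rows from `perf report`.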

Commit: Add const notation to variable pg
On Aug 7, 2025, the github-actions bot added the ggml label (changes relating to the ggml tensor library for machine learning).