@angt (Collaborator) commented Sep 22, 2025

This is a draft that uses cpp-httplib for downloads, mostly copied from the existing cURL implementation.
To test it, build with -DLLAMA_CURL=OFF.
Some features might be missing for now, but it's a starting point.
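
For reference, a typical configure/build sequence to exercise the httplib path (standard CMake usage; only the LLAMA_CURL flag is specific to this test):

# Configure without cURL so downloads go through cpp-httplib, then build.
cmake -B build -DLLAMA_CURL=OFF
cmake --build build --config Release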

@angt angt requested a review from ggerganov as a code owner September 22, 2025 23:17
@angt angt force-pushed the use-cpp-httplib-as-a-curl-alternative-for-downloads branch 3 times, most recently from 0201e99 to e1f545f on September 23, 2025 10:49
@angt (Collaborator, Author) commented Sep 24, 2025

This is the one that concerns me, since cpp-httplib is currently a required dependency of llama.cpp:

error: "cpp-httplib doesn't support Windows 8 or lower. Please use Windows 10 or later."

@ggerganov (Member):

> This is the one that concerns me, since cpp-httplib is currently a required dependency of llama.cpp:
>
> error: "cpp-httplib doesn't support Windows 8 or lower. Please use Windows 10 or later."

It shouldn't be too difficult to add a LLAMA_HTTPLIB option and do the same thing we currently do on master when LLAMA_CURL=OFF?
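
Roughly, the option could look like this on the CMake side (the option and macro names here are illustrative, not the final code):

# Sketch: make cpp-httplib optional, mirroring what LLAMA_CURL does today.
option(LLAMA_HTTPLIB "llama: use cpp-httplib for model downloads" ON)

if (LLAMA_HTTPLIB)
    target_compile_definitions(common PRIVATE LLAMA_USE_HTTPLIB)
endif()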

@ggerganov (Member) left a comment:

Looks good. The biggest unknown for me is the Windows workflow - building and releases. I suppose whatever we currently do to provide libcurl we have to do for libssl.

If you plan to bring this to completion, feel free to add yourself to the CODEOWNERS. I see the following TODOs:

  • Extract file downloading implementation from common/arg.cpp to common/download.cpp
  • Remove CURL dependency (+ figure out how to build on Windows)
  • Remove json dependency from common/download.cpp
  • Add CMake option to build without httplib for old Windows support

Comment on lines +699 to +710
static void write_metadata(const std::string & path,
                           const std::string & url,
                           const common_file_metadata & metadata) {
    nlohmann::json metadata_json = {
        { "url",          url },
        { "etag",         metadata.etag },
        { "lastModified", metadata.last_modified }
    };

    write_file(path, metadata_json.dump(4));
    LOG_DBG("%s: file metadata saved: %s\n", __func__, path.c_str());
}
@ggerganov (Member):

Same comment as in the previous PR about the json stuff: I hope eventually we will avoid using json for this component - it's a pity we started doing it in the first place.

@angt (Collaborator, Author):

We can definitely remove json here; in fact, just reading/writing the etag is enough.
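
For illustration, a minimal etag-only sketch (the helper names are hypothetical, not the final implementation):

// Sketch: persist only the etag as plain text next to the model file,
// removing the need for nlohmann::json in this component.
#include <fstream>
#include <string>

static void write_etag(const std::string & path, const std::string & etag) {
    std::ofstream out(path, std::ios::trunc);
    out << etag << '\n';
}

static std::string read_etag(const std::string & path) {
    std::ifstream in(path);
    std::string etag;
    std::getline(in, etag);
    return etag; // empty if the file is missing or unreadable
}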

@angt (Collaborator, Author) commented Sep 24, 2025

> > This is the one that concerns me, since cpp-httplib is currently a required dependency of llama.cpp:
> >
> > error: "cpp-httplib doesn't support Windows 8 or lower. Please use Windows 10 or later."
>
> It shouldn't be too difficult to add a LLAMA_HTTPLIB option and do the same thing we currently do on master when LLAMA_CURL=OFF?

The Windows issue comes from updating httplib (upstream PR yhirose/cpp-httplib#2177).
I don't think keeping the old version would be a good idea, and I don't believe it's reasonable to support Windows 8 without llama-server, is it?

Possible solutions could be either patching httplib to restore Windows 8 compatibility, or switching to another HTTP library.

@ggerganov (Member) commented Sep 24, 2025

> I don't think keeping the old version would be a good idea

Yes, we should stick with the latest version of httplib.

> I don't believe it's reasonable to support Windows 8 without llama-server, is it?

The idea is that with LLAMA_HTTPLIB=OFF we build stub download functions that simply print an error saying downloading is not supported. Windows 8 can still run llama-server; it just won't be able to download models.
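
A minimal sketch of that idea (the guard and function names are illustrative, not the actual code; LOG_ERR is llama.cpp's logging macro):

// Sketch of the stub path when building with LLAMA_HTTPLIB=OFF.
#include <string>

#ifdef LLAMA_USE_HTTPLIB
// ... real cpp-httplib-based download implementation ...
#else
static bool common_download_file(const std::string & url, const std::string & path) {
    (void) url;
    (void) path;
    LOG_ERR("%s: built without libcurl or cpp-httplib, downloading models is not supported\n", __func__);
    return false;
}
#endif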

I suspect that these failing CI workflows are currently happening only for the msys/mingw toolchain. Likely there is a simple fix by tuning the WIN32 preprocessor macros to make httplib happy. Note that the runners are not actually using Windows 8, so it's some sort of mis-detection. Worst case, I think we can safely disable downloading capabilities for these specific builds.

@angt angt requested a review from CISC as a code owner September 25, 2025 07:15
@github-actions github-actions bot added the devops improvements to build systems and github actions label Sep 25, 2025
@ggerganov (Member):

Regarding 547fa26 - I suppose this is temporary? We want to keep the upstream version unchanged, so any modifications should first be upstreamed to the original repo.

@angt (Collaborator, Author) commented Sep 25, 2025

> Regarding 547fa26 - I suppose this is temporary? We want to keep the upstream version unchanged, so any modifications should first be upstreamed to the original repo.

Yes, this was only to confirm that everything builds correctly with it.

@angt (Collaborator, Author) commented Sep 25, 2025

Since cpp-httplib is mandatory for llama-server (with or without the model downloader), we can bump _WIN32_WINNT to 0x0A00 to align with the current restriction.
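
In CMake terms, that could be as simple as the following sketch (the actual change may differ):

# Sketch: target Windows 10 (_WIN32_WINNT=0x0A00), matching cpp-httplib's minimum.
if (WIN32)
    add_compile_definitions(_WIN32_WINNT=0x0A00)
endif()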

@ggerganov (Member):

> Since cpp-httplib is mandatory for llama-server

Oh right, I missed that when I wrote the comment earlier.

> we can bump _WIN32_WINNT to 0x0A00 to align with the current restriction.

Yes, let's give this a try.

@angt angt force-pushed the use-cpp-httplib-as-a-curl-alternative-for-downloads branch from 1f97bec to aad19ef on September 25, 2025 09:42
@github-actions github-actions bot added the ggml changes relating to the ggml tensor library for machine learning label Sep 25, 2025
@angt (Collaborator, Author) commented Sep 25, 2025

Note @ggerganov: I've tested the version including commit 547fa26 (cpp-httplib: allow _WIN32_WINNT >= 0x0602) and it works fine under Wine. There should be no issue retargeting Windows 8 if needed.

$ wine build/bin/llama-server.exe -hf unsloth/Qwen3-4B-Instruct-2507-GGUF:Q4_0
it looks like wine32 is missing, you should install it.
multiarch needs to be enabled first.  as root, please
execute "dpkg --add-architecture i386 && apt-get update &&
apt-get install wine32:i386"
0048:err:winediag:nodrv_CreateWindow Application tried to create a window, but no driver could be loaded.
0048:err:winediag:nodrv_CreateWindow L"The explorer process failed to start."
0048:err:systray:initialize_systray Could not create tray window
common_download_file_single_online: no previous model file found C:\users\angt\AppData\Local\llama.cpp\unsloth_Qwen3-4B-Instruct-2507-GGUF_Qwen3-4B-Instruct-2507-Q4_0.gguf
common_download_file_single_online: trying to download model from https://huggingface.co/unsloth/Qwen3-4B-Instruct-2507-GGUF/resolve/main/Qwen3-4B-Instruct-2507-Q4_0.gguf to C:\users\angt\AppData\Local\llama.cpp\unsloth_Qwen3-4B-Instruct-2507-GGUF_Qwen3-4B-Instruct-2507-Q4_0.gguf.downloadInProgress (server_etag:"aaf2d7f5827dd5d918cc73fefac1d96c704f6b6cd3d2c36d1e9f5c3ac675d94f", server_last_modified:)...
[>                              ^C                  ]   0%  (15 MB / 2265 MB)

And no issues when linking libssl statically on Windows :)

$ peldd build/bin/llama-server.exe
Dependencies
    ADVAPI32.dll
    bcrypt.dll
    CRYPT32.dll
    KERNEL32.dll
    msvcrt.dll
    WS2_32.dll

@angt angt force-pushed the use-cpp-httplib-as-a-curl-alternative-for-downloads branch from aad19ef to e7b5f55 on September 25, 2025 13:51
@ggerganov (Member):

Hm, the address sanitizer is acting up. Not sure if related though.

@angt (Collaborator, Author) commented Sep 25, 2025

> Hm, the address sanitizer is acting up. Not sure if related though.

I think the binaries were compiled with flags that are not supported by the emulator. All the errors are ILLEGAL:

The following tests FAILED:
	  1 - test-tokenizer-0-bert-bge (ILLEGAL)               main
	  2 - test-tokenizer-0-command-r (ILLEGAL)              main
	  3 - test-tokenizer-0-deepseek-coder (ILLEGAL)         main
	  4 - test-tokenizer-0-deepseek-llm (ILLEGAL)           main
	  5 - test-tokenizer-0-falcon (ILLEGAL)                 main
	  6 - test-tokenizer-0-gpt-2 (ILLEGAL)                  main
	  7 - test-tokenizer-0-llama-bpe (ILLEGAL)              main
	  8 - test-tokenizer-0-llama-spm (ILLEGAL)              main
	  9 - test-tokenizer-0-mpt (ILLEGAL)                    main
	 10 - test-tokenizer-0-phi-3 (ILLEGAL)                  main
	 11 - test-tokenizer-0-qwen2 (ILLEGAL)                  main
	 12 - test-tokenizer-0-refact (ILLEGAL)                 main
	 13 - test-tokenizer-0-starcoder (ILLEGAL)              main
	 21 - test-tokenizer-1-llama-spm (ILLEGAL)              main
	 27 - test-thread-safety (ILLEGAL)                      main
	 28 - test-arg-parser (ILLEGAL)                         main
	 29 - test-gguf (ILLEGAL)                               main
	 30 - test-backend-ops (ILLEGAL)                        main
	 33 - test-barrier (ILLEGAL)                            main
	 34 - test-quantize-fns (ILLEGAL)                       main
	 35 - test-quantize-perf (ILLEGAL)                      main
	 36 - test-rope (ILLEGAL)                               main
Errors while running CTest
Output from these tests are in: /home/runner/work/llama.cpp/llama.cpp/build/Testing/Temporary/LastTest.log
Use "--rerun-failed --output-on-failure" to re-run the failed cases verbosely.

I guess it’s just random hardware selection. I can dig into that later.

@ggerganov (Member):

I restarted the workflows. If CI is green, I think we are good to merge, correct?

@angt (Collaborator, Author) commented Sep 25, 2025

> I restarted the workflows. If CI is green, I think we are good to merge, correct?

Yes! I can do the refactor (downloader.cpp) and remove the json dependency right after.

@angt (Collaborator, Author) commented Sep 25, 2025

I think I can fix https://github.com/ggml-org/llama.cpp/actions/runs/18009748371/job/51252952728?pr=16185

@ggerganov (Member):

> I think I can fix https://github.com/ggml-org/llama.cpp/actions/runs/18009748371/job/51252952728?pr=16185

Yup, this should be fixed before merging.

@slaren (Member) commented Sep 25, 2025

I cleared the ccache cache of the sanitizer test before you re-ran the CI; I suspect that was the cause.

Comment on lines 219 to 221
-DCMAKE_SYSTEM_NAME=Linux \
-DGGML_CCACHE=OFF \
-DGGML_NATIVE=OFF \
@ggerganov (Member):


I don't think I understand how this change fixed the CI for ubuntu-cpu-make?

@angt (Collaborator, Author):

I was a bit extreme on this one and tried everything I could think of to make it work.

The tricky part is CMAKE_SYSTEM_NAME=Linux, which makes CMake believe we are cross-compiling (it sets CMAKE_CROSSCOMPILING) without breaking everything else. With that flag, I can disable GGML_NATIVE_DEFAULT:

if (CMAKE_CROSSCOMPILING OR DEFINED ENV{SOURCE_DATE_EPOCH})
    message(STATUS "Setting GGML_NATIVE_DEFAULT to OFF")
    set(GGML_NATIVE_DEFAULT OFF)
else()
    set(GGML_NATIVE_DEFAULT ON)
endif()

Then, disabling GGML_NATIVE ensures INS_ENB stays off:

if (GGML_NATIVE OR NOT GGML_NATIVE_DEFAULT)
    set(INS_ENB OFF)
else()
    set(INS_ENB ON)
endif()

This way we get the lowest CPU mode. But honestly, -DGGML_NATIVE=OFF should be enough, and I also disabled ccache to increase my chances :)

@angt (Collaborator, Author) commented Sep 26, 2025

@ggerganov I believe we’re good now, right?

@ggerganov (Member):

The ubuntu-cpu-make workflow used to be successful without GGML_NATIVE=OFF, so not sure this last commit is needed.

Is it possible that the ccache clearing also affected this workflow and we simply had to rerun it in the first place?

@angt (Collaborator, Author) commented Sep 26, 2025

> The ubuntu-cpu-make workflow used to be successful without GGML_NATIVE=OFF, so not sure this last commit is needed.
>
> Is it possible that the ccache clearing also affected this workflow and we simply had to rerun it in the first place?

I believe disabling GGML_NATIVE in CI/CD should be the standard approach to ensure reproducibility when that feature is not explicitly tested.
I can remove the commit, but I expect the workflow will still be randomly green/red as before.

@angt (Collaborator, Author) commented Sep 26, 2025

We can merge #16257 first so I can rebase and clean up this PR.

@angt angt force-pushed the use-cpp-httplib-as-a-curl-alternative-for-downloads branch from 1e653fa to 1c92441 on September 26, 2025 10:44
@angt angt requested a review from danbev as a code owner September 26, 2025 10:44
@angt angt force-pushed the use-cpp-httplib-as-a-curl-alternative-for-downloads branch from 1c92441 to 6c53aef on September 26, 2025 10:45
@angt (Collaborator, Author) commented Sep 26, 2025

Should I make a dedicated PR to disable GGML_NATIVE in the CI to avoid this kind of issue?

@ggerganov (Member):

Yes, let's push this change in a separate PR.

I tried to reproduce these failures locally on my Ubuntu machine, but I can't. And I don't think this workflow has ever failed before like this. So I am still confused why it started happening now.

If it is related to ccache, I wonder why it is not happening to other workflows that do not have GGML_NATIVE=OFF. Do we need to disable native builds on all workflows too? That does not seem like a good solution.

@ggerganov ggerganov merged commit b995a10 into ggml-org:master Sep 26, 2025
59 of 63 checks passed
struct pushed a commit to struct/llama.cpp that referenced this pull request Sep 26, 2025
…g#16185)

* vendor : update httplib

Signed-off-by: Adrien Gallouët <[email protected]>

* common : use cpp-httplib as a cURL alternative for downloads

The existing cURL implementation is intentionally left untouched to
prevent any regressions and to allow for safe, side-by-side testing by
toggling the `LLAMA_CURL` CMake option.

Signed-off-by: Adrien Gallouët <[email protected]>

* ggml : Bump to Windows 10

Signed-off-by: Adrien Gallouët <[email protected]>

---------

Signed-off-by: Adrien Gallouët <[email protected]>