Description
Prerequisites
Please answer the following questions for yourself before submitting an issue.
- I am running the latest code. Development is very rapid so there are no tagged versions as of now.
- I carefully followed the README.md.
- I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
- I reviewed the Discussions, and have a new bug or useful enhancement to share.
Expected Behavior
Building llama-cpp-python with OpenCL CLBlast support (not the CLBlast libs bundled with the CUDA Toolkit) on Windows should produce a package that works immediately, without any additional steps.
Current Behavior
Loading `llama.dll` fails unless the CLBlast libs are added to `PATH`. Removing `cdll_args["winmode"] = 0` from `llama_cpp.py` (Source) allows `llama.dll` to load successfully using the CLBlast libs included in the package directory.
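For context, the loading logic looks roughly like this (a paraphrase of `_load_shared_library` in `llama_cpp.py`, not the verbatim source):

```python
import ctypes
import pathlib
import sys

# Paraphrase of llama_cpp.py's _load_shared_library (not the verbatim source).
_lib_path = pathlib.Path(__file__).parent / "llama.dll"

cdll_args = {}
if sys.platform == "win32":
    # With winmode=0, ctypes passes flags=0 to LoadLibraryExW, so dependent
    # DLLs (e.g. clblast.dll) are resolved via the legacy search order:
    # application dir, system dirs, current dir, PATH -- but NOT the
    # directory that contains llama.dll itself.
    cdll_args["winmode"] = 0  # removing this line lets adjacent DLLs load

try:
    _lib = ctypes.CDLL(str(_lib_path), **cdll_args)
except OSError as e:
    raise RuntimeError(f"Failed to load shared library '{_lib_path}': {e}")
```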
Environment and Context
- i7-5820K
- GTX 1080 Ti
- Windows 10 19045
- Conda 23.1.0
- Python 3.10.11
- MSVC 19.36.32537.0
- CMake 3.27.0
Steps to Reproduce
- Build and install:
  - https://github.com/KhronosGroup/OpenCL-SDK.git -b v2023.04.17
  - https://github.com/CNugteren/CLBlast.git -b 1.6.1
- Use the following commands to build and install llama-cpp-python:
set "CMAKE_PREFIX_PATH=\path\to\CLBlast\root"
set "CMAKE_ARGS=-DLLAMA_CLBLAST=on"
set FORCE_CMAKE=1
set VERBOSE=1
python -m pip install git+https://github.com/abetlen/llama-cpp-python --no-cache-dir -v
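The failure can also be reproduced without text-generation-webui; importing the package is enough to trigger the DLL load:

```python
# Minimal reproduction: llama_cpp loads llama.dll at import time, so a bare
# import fails when the CLBlast DLLs cannot be found on the search path.
import llama_cpp  # RuntimeError: Failed to load shared library '...llama.dll'
```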
Failure Logs
Using text-generation-webui to load:
```
Traceback (most recent call last):
  File "G:\F\Projects\AI\text-generation-webui\one-click-installers-test\installer_files\env\lib\site-packages\llama_cpp\llama_cpp.py", line 67, in _load_shared_library
    return ctypes.CDLL(str(_lib_path), **cdll_args)
  File "G:\F\Projects\AI\text-generation-webui\one-click-installers-test\installer_files\env\lib\ctypes\__init__.py", line 374, in __init__
    self._handle = _dlopen(self._name, mode)
FileNotFoundError: Could not find module 'G:\F\Projects\AI\text-generation-webui\one-click-installers-test\installer_files\env\Lib\site-packages\llama_cpp\llama.dll' (or one of its dependencies). Try using the full path with constructor syntax.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "G:\F\Projects\AI\text-generation-webui\one-click-installers-test\text-generation-webui\server.py", line 68, in load_model_wrapper
    shared.model, shared.tokenizer = load_model(shared.model_name, loader)
  File "G:\F\Projects\AI\text-generation-webui\one-click-installers-test\text-generation-webui\modules\models.py", line 78, in load_model
    output = load_func_map[loader](model_name)
  File "G:\F\Projects\AI\text-generation-webui\one-click-installers-test\text-generation-webui\modules\models.py", line 232, in llamacpp_loader
    from modules.llamacpp_model import LlamaCppModel
  File "G:\F\Projects\AI\text-generation-webui\one-click-installers-test\text-generation-webui\modules\llamacpp_model.py", line 16, in <module>
    from llama_cpp import Llama, LlamaCache, LogitsProcessorList
  File "G:\F\Projects\AI\text-generation-webui\one-click-installers-test\installer_files\env\lib\site-packages\llama_cpp\__init__.py", line 1, in <module>
    from .llama_cpp import *
  File "G:\F\Projects\AI\text-generation-webui\one-click-installers-test\installer_files\env\lib\site-packages\llama_cpp\llama_cpp.py", line 80, in <module>
    _lib = _load_shared_library(_lib_base_name)
  File "G:\F\Projects\AI\text-generation-webui\one-click-installers-test\installer_files\env\lib\site-packages\llama_cpp\llama_cpp.py", line 69, in _load_shared_library
    raise RuntimeError(f"Failed to load shared library '{_lib_path}': {e}")
RuntimeError: Failed to load shared library 'G:\F\Projects\AI\text-generation-webui\one-click-installers-test\installer_files\env\Lib\site-packages\llama_cpp\llama.dll': Could not find module 'G:\F\Projects\AI\text-generation-webui\one-click-installers-test\installer_files\env\Lib\site-packages\llama_cpp\llama.dll' (or one of its dependencies). Try using the full path with constructor syntax.
```
Activity
abetlen commented on Sep 15, 2023

Hey @jllllll happy to help, sorry for the late reply here. Taking a look back at this, the `winmode=0` thing was added to fix another Windows + CUDA issue in #208.

Would making it an environment variable that you can modify work for you?
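(For illustration, one shape such an escape hatch could take; `LLAMA_CPP_DLL_WINMODE` is a hypothetical name, not an existing setting:)

```python
import ctypes
import os
import sys

cdll_args = {}
if sys.platform == "win32":
    # Hypothetical knob: keep winmode=0 as the default, but let users clear
    # or change it without patching llama_cpp.py.
    raw = os.environ.get("LLAMA_CPP_DLL_WINMODE", "0")
    if raw != "":  # set LLAMA_CPP_DLL_WINMODE= (empty) to use ctypes' default
        cdll_args["winmode"] = int(raw, 0)

lib = ctypes.CDLL(r"C:\path\to\site-packages\llama_cpp\llama.dll", **cdll_args)
```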
jllllll commented on Sep 15, 2023

That's fine. Though, I don't think that adding `winmode=0` ever fixed the original issue with CUDA. I think that this change is what actually fixed it: #225

As far as I can tell, `winmode=0` only restricts the Windows library search paths. I'm not sure of the full extent of what it does, but it seems to exclude libraries adjacent to the one you are trying to load.

The docs for its functionality are somewhat obtuse. According to the ctypes docs, this is the list of valid values for it:
https://learn.microsoft.com/en-us/windows/win32/api/libloaderapi/nf-libloaderapi-loadlibraryexa#parameters

`0` does not seem to be listed there as a valid value, yet it does not cause an error. I can only assume that it falls back to some other behavior. The Python ctypes docs used to say that `winmode=0` was necessary on Windows, but they seem to have removed that.

Honestly, I have very little idea of how `winmode` actually works. All I know is that removing it did not hinder any of my tests loading cuBLAS.
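(For reference: on Python 3.8+, when a full path is given, ctypes builds the default `winmode` from documented `LOAD_LIBRARY_SEARCH_*` flags, which is what makes adjacent DLLs loadable. A small illustration, with example paths:)

```python
import ctypes

# Flag values from the Win32 LoadLibraryExW documentation:
LOAD_LIBRARY_SEARCH_DLL_LOAD_DIR = 0x00000100  # search the loaded DLL's own dir
LOAD_LIBRARY_SEARCH_DEFAULT_DIRS = 0x00001000  # app dir, System32, added dirs

# Default (winmode omitted, full path given): ctypes passes the two flags
# above, so a clblast.dll sitting next to llama.dll is found.
ctypes.CDLL(r"C:\env\Lib\site-packages\llama_cpp\llama.dll")

# winmode=0: LoadLibraryExW gets flags 0, i.e. the legacy search order
# (application dir, system dirs, current dir, PATH) -- the directory that
# contains llama.dll is NOT searched for its dependencies.
ctypes.CDLL(r"C:\env\Lib\site-packages\llama_cpp\llama.dll", winmode=0)
```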
abetlen commented on Sep 15, 2023

You're right, I think I misread the docs there. Does setting it to `ctypes.RTLD_GLOBAL` have any effect for you?
jllllll commented on Sep 15, 2023

`ctypes.RTLD_GLOBAL` seems to be set to `0` on Windows, so this produces the same behavior. I believe it is associated with the `mode` parameter, which isn't used on Windows.

Interestingly, after redoing my tests, the CLBlast libs are not included with the package data as they were in my initial tests. I can't figure out what I did differently at the time to get that to happen, so I'm just copying the CLBlast libs to the `site-packages\llama_cpp` directory for these tests.

My hope with all this is to allow CLBlast to work without any additional steps beyond simply installing llama-cpp-python. Though, I'm starting to think that this may be harder to achieve than I thought.
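(The `RTLD_GLOBAL` observation above is easy to confirm; on Windows ctypes defines both constants as `0`, so passing either is a no-op:)

```python
import ctypes

# On Windows both constants are 0, so winmode=ctypes.RTLD_GLOBAL is
# indistinguishable from winmode=0.
print(ctypes.RTLD_LOCAL, ctypes.RTLD_GLOBAL)  # -> 0 0 on Windows
```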
jllllll commented on Sep 15, 2023

Figured it out. Adding `$<TARGET_RUNTIME_DLLS:llama>` to `CMakeLists.txt` results in the CLBlast libs being added to the package. It evaluates to an empty string on non-Windows systems. I haven't tested yet whether this avoids modifying behavior on non-Windows. It also doesn't seem to cause issues with cuBLAS: it doesn't include the cuBLAS libs.
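A sketch of the kind of rule this implies (reconstructed around the generator expression above; not necessarily the exact snippet used):

```cmake
# Copy the runtime DLLs the llama target links against (e.g. clblast.dll)
# into the installed llama_cpp package directory. TARGET_RUNTIME_DLLS
# requires CMake >= 3.21 and expands to an empty list on non-DLL platforms.
install(
    FILES $<TARGET_RUNTIME_DLLS:llama>
    DESTINATION llama_cpp
)
```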
The biggest issue I've found so far is that this requires a minimum CMake version of 3.21.

abetlen commented on Sep 15, 2023
@jllllll I think that's okay; cmake is available as a pip package, and scikit-build-core should use that if the user's installed version is below the minimum set in the pyproject.
I'll test on my system as well, and in any case can just put it inside of an `if(WIN32)` block or similar.
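For instance, guarding the same hypothetical rule:

```cmake
if(WIN32)
    # Only attempt to bundle dependent DLLs on Windows.
    install(FILES $<TARGET_RUNTIME_DLLS:llama> DESTINATION llama_cpp)
endif()
```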