Ggerganov #5

YellowRoseCx · 2023-04-16T05:58:38Z

No description provided.

* ggml : add Q8_0 quantization for intermediate results * quantize-stats : fix test + add it to Makefile default * Q8: use int8_t, AVX/AVX2 optimizations * ggml : fix quantize_row_q8_0() ARM_NEON rounding * minor : updates after rebase to latest master * quantize-stats : delete obsolete strings * ggml : fix q4_1 dot func --------- Co-authored-by: Stephan Walter <[email protected]>
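For readers unfamiliar with the Q8_0 format named in the commit above, here is a minimal sketch of block-wise 8-bit quantization. It assumes blocks of 32 values, one float scale per block, and round-half-away-from-zero; the names `quantize_block_q8_0` and `dequantize_block_q8_0` are illustrative Python helpers, not the ggml C functions, and the real kernels (including the AVX/NEON paths) may differ in detail.

```python
import math

QK8_0 = 32  # assumed block size for this sketch


def round_half_away(v):
    """Round to nearest integer, ties away from zero (like C's roundf)."""
    return int(math.copysign(math.floor(abs(v) + 0.5), v))


def quantize_block_q8_0(xs):
    """Quantize one block of QK8_0 floats to (scale, list of int8-range values)."""
    assert len(xs) == QK8_0
    amax = max(abs(x) for x in xs)      # largest magnitude in the block
    d = amax / 127.0                    # scale so that amax maps to +/-127
    inv = 1.0 / d if d else 0.0         # an all-zero block quantizes to zeros
    qs = [round_half_away(x * inv) for x in xs]
    return d, qs


def dequantize_block_q8_0(d, qs):
    """Recover approximate floats from a quantized block."""
    return [d * q for q in qs]
```

Each reconstructed value differs from the original by at most about half the scale `d`, which is why the format suits intermediate results that only need to survive a single dot product.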

…uins#991) Calling `mmap.mmap` on Windows apparently resets the file offset of the raw file object (and makes the BufferedReader return a *negative* file offset). For safetensors, avoid using the file offset after calling mmap. For GGML format, explicitly save and restore the offset. Fixes LostRuins#966.
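The save-and-restore approach described for the GGML path can be sketched roughly as follows. This is an illustration only, not the code from the commit: `mmap_preserving_offset` and `demo` are hypothetical names, and the sketch uses only the standard `mmap`, `os`, and `tempfile` modules.

```python
import mmap
import os
import tempfile


def mmap_preserving_offset(f):
    """Map the whole file read-only, then put the file offset back where it was."""
    saved = f.tell()                      # remember the position before mapping
    mapped = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    f.seek(saved)                         # restore explicitly; harmless if mmap left it alone
    return mapped


def demo():
    """Read a header, map the file, then keep reading from the same handle."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"HEADERpayload")
        path = tmp.name
    try:
        with open(path, "rb") as f:
            header = f.read(6)
            mapped = mmap_preserving_offset(f)
            rest = f.read()               # continues right after the header
            view = bytes(mapped[6:])      # same bytes, read through the mapping
            mapped.close()
        return header, rest, view
    finally:
        os.remove(path)
```

Restoring the offset unconditionally keeps the helper correct on platforms where `mmap.mmap` does not disturb the file position, so no platform check is needed.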

ggerganov master to upstreamchanges

…#5) * initialize rocblas

* use hipblas based on cublas * Update Makefile for the Cuda kernels * Expand arch list and make it overrideable * Fix multi GPU on multiple amd architectures with rocblas_initialize() (#5) * add hipBLAS to README * new build arg LLAMA_CUDA_MMQ_Y * fix half2 decomposition * Add intrinsics polyfills for AMD * AMD assembly optimized __dp4a * Allow overriding CC_TURING * use "ROCm" instead of "CUDA" * ignore all build dirs * Add Dockerfiles * fix llama-bench * fix -nommq help for non CUDA/HIP --------- Co-Authored-By: YellowRoseCx <[email protected]> Co-Authored-By: ardfork <[email protected]> Co-Authored-By: funnbot <[email protected]> Co-Authored-By: Engininja2 <[email protected]> Co-Authored-By: Kerfuffle <[email protected]> Co-Authored-By: jammm <[email protected]> Co-Authored-By: jdecourval <[email protected]>

commit 3416c986d9d9a31c3cdefd7e7bd4d9438d72ba35 Merge: 5eb17f0 4c4e435 Author: YellowRoseCx <[email protected]> Date: Fri Aug 25 13:46:56 2023 -0500 Merge remote-tracking branch 'upstream/concedo' commit 5eb17f02c8638e003bb91bddf95ccf54d2ad0c12 Author: YellowRoseCx <[email protected]> Date: Fri Aug 25 13:38:21 2023 -0500 ROCm Port update * use hipblas based on cublas * Update Makefile for the Cuda kernels * Expand arch list and make it overrideable * Fix multi GPU on multiple amd architectures with rocblas_initialize() (#5) * add hipBLAS to README * new build arg LLAMA_CUDA_MMQ_Y * fix half2 decomposition * Add intrinsics polyfills for AMD * AMD assembly optimized __dp4a * Allow overriding CC_TURING * use "ROCm" instead of "CUDA" * ignore all build dirs * Add Dockerfiles * fix llama-bench * fix -nommq help for non CUDA/HIP --------- Co-Authored-By: YellowRoseCx <[email protected]> Co-Authored-By: ardfork <[email protected]> Co-Authored-By: funnbot <[email protected]> Co-Authored-By: Engininja2 <[email protected]> Co-Authored-By: Kerfuffle <[email protected]> Co-Authored-By: jammm <[email protected]> Co-Authored-By: jdecourval <[email protected]> commit 4c4e4358ed54c397d3f0f5bc268f1ac59d909f57 Author: Concedo <[email protected]> Date: Thu Aug 24 22:12:56 2023 +0800 fixed linux build error commit 661bede62fe216632d099678a9dac08de7a68a4e Author: Concedo <[email protected]> Date: Thu Aug 24 21:16:16 2023 +0800 optimize tokenize method commit b95a4ccb228ebfac12e5ce4b445f073ca67b99d2 Author: Concedo <[email protected]> Date: Thu Aug 24 20:41:49 2023 +0800 added a token counting endpoint, set mmq as default commit 81a0ef342ce1e583f6a5b060252565dbd59e1d8d Author: Concedo <[email protected]> Date: Thu Aug 24 16:26:38 2023 +0800 updated lite, switched to unminified source commit 598d4d89ab3aaa539ddf36784306071f1411814a Author: Concedo <[email protected]> Date: Thu Aug 24 15:45:33 2023 +0800 fix for config file loading. 
from kcpp settings file commit a3b994962673e681aafd9503781c7470acdcc63f Merge: b8372d4 2d86b2e Author: Concedo <[email protected]> Date: Thu Aug 24 15:22:17 2023 +0800 Merge remote-tracking branch 'pop/add_config_arg' into concedo_experimental commit b8372d44666531f5d17cbe264912fbe5548fd54b Merge: 8263fd7 6e91a1b Author: Concedo <[email protected]> Date: Thu Aug 24 15:21:24 2023 +0800 Merge branch 'master' into concedo_experimental # Conflicts: # .gitignore # README.md # tests/CMakeLists.txt commit 6e91a1b0706c2e0e52b9d9be7ee82d3c1e7a33c1 Author: Evan Jones <[email protected]> Date: Thu Aug 24 00:07:13 2023 -0400 llama : fix grammar sometimes generating null char (#2756) commit 44d5462b5cddc1c5cbcd7647646f7b55b175b01f Author: Georgi Gerganov <[email protected]> Date: Wed Aug 23 23:44:19 2023 +0300 readme : fix link commit c7868b075377c8c3fa916ea7c1aca600f44bed55 Author: Georgi Gerganov <[email protected]> Date: Wed Aug 23 23:43:00 2023 +0300 minor : fix trailing whitespace commit 79da24b58c1ea72340e64f799a4717d372207676 Author: Georgi Gerganov <[email protected]> Date: Wed Aug 23 23:41:16 2023 +0300 readme : update hot topics commit cf658adc832badaaa2ca119fe86070e5a830f8f6 Author: Georgi Gerganov <[email protected]> Date: Wed Aug 23 23:08:04 2023 +0300 llm : add Falcon support (#2717) * llama : refactor GGUF constants into static maps * llama : check if model architecture is known * llama : refactor llama_model_load_internal() * gguf : add KV constant maps * llm : read arch-specific KVs * convert : add dummy scores + types * falcon : load tensor data (CPU only) * llama : fix loading progress bar * llama : add arch member to llama_model * falcon : CPU inference working * falcon : support non-40B models * falcon : minor * llama : minor updates ggml-ci * convert-falcon-hf-to-gguf.py : fix special token mapping * llama.cpp : llama default UNK token = id 0 * llama.cpp : fix bpe tokenizer * llama.cpp : fix the fix of bpe tokenizer * ggml : pass eps to ggml_norm * metal : 
implement RoPE (mode = 2) + avoid ggml_repeat * ggml : ggml_repeat always creates new tensor * falcon : copy-paste self-attention from LLaMA * metal : print extra compute pipeline info * falcon : minor changes (still chasing the Metal problem) * llama.cpp : fix linefeed token * metal : fix GELU kernel numerical stability by using precise::tanh * metal : temporary workaround for the concurrency optimization bug * falcon : add CUDA offloading (#2739) * llama : better model naming and size reporting * llama : prep new tokenizer support * llama : advanced BPE tokenizer based on ggllm.cpp imlpementation * llama : remove oboslete comment ggml-ci * common : remove obsolete BPE API + disable test-tokenizer-1 * llama : revert BPE special-case in llama_byte_to_token() * cuda : add TODOs for RoPE NeoX implementation * llama : default special tokens based on vocab type * perplexity : add log for start of tokenization --------- Co-authored-by: klosax <[email protected]> Co-authored-by: slaren <[email protected]> commit a192860cfec89a38d59a943623bf595b1fe4495b Author: Georgi Gerganov <[email protected]> Date: Wed Aug 23 22:37:39 2023 +0300 minor : fix trailing whitespace commit 95385241a91a616788a3bb76d12c9b7b2379ca2d Author: Olivier Chafik <[email protected]> Date: Wed Aug 23 20:33:05 2023 +0100 examples : restore the functionality to import llama2.c models (#2685) * Fix import of llama2.c models that don't share weights between embedding layers * llama2c: reinstate ggmlv3 conversion output + update readme w/ gguf conv * llama2.c: comment out legacy "load from ggml model" logic * llama2.c: convert special-cased "<0xXX>" single byte tokens from tokenizer.bin commit 335acd2ffd7b04501c6d8773ab9fcee6e7bf8639 Author: slaren <[email protected]> Date: Wed Aug 23 16:46:54 2023 +0200 fix convert-lora-to-ggml.py (#2738) commit 5290c38e6e9b66ee2b543e560e301c1a1a90929c Author: klosax <[email protected]> Date: Wed Aug 23 16:46:03 2023 +0200 main : insert bos if no tokens (#2727) * main.cpp 
: insert bos if no tokens * Update examples/main/main.cpp * Update examples/main/main.cpp --------- Co-authored-by: Georgi Gerganov <[email protected]> commit cc34dbda9681418a2b18382446b90cdcec398d82 Author: akawrykow <[email protected]> Date: Wed Aug 23 07:31:34 2023 -0700 gitignore : fix for windows (#2729) commit 7c2227a1972a4add4b5c118e4914c086513d0382 Author: Cebtenzzre <[email protected]> Date: Wed Aug 23 10:29:09 2023 -0400 chmod : make scripts executable (#2675) commit f19dca04ea5fbf9a0b2753091d93464585d5c73b Author: JohnnyB <[email protected]> Date: Wed Aug 23 15:28:22 2023 +0100 devops : RPM Specs (#2723) * Create llama-cpp.srpm * Rename llama-cpp.srpm to llama-cpp.srpm.spec Correcting extension. * Tested spec success. * Update llama-cpp.srpm.spec * Create lamma-cpp-cublas.srpm.spec * Create lamma-cpp-clblast.srpm.spec * Update lamma-cpp-cublas.srpm.spec Added BuildRequires * Moved to devops dir commit 8263fd7bdb247f2c3ff21debb50b22bd9b030339 Author: askmyteapot <[email protected]> Date: Thu Aug 24 00:15:48 2023 +1000 Update llama_v3.cpp (#393) Fixing C2065 compiler error. 
Missed '3' on 3 separate identifiers (kB > kB3, MB > MB3) commit bfdc596d58fbd9bbadd2352705af4373005e1411 Author: Concedo <[email protected]> Date: Wed Aug 23 19:19:52 2023 +0800 gguf reader in file format detection commit 8207214b6a37a46526cee9e72d4c9092b9d1872f Author: Kawrakow <[email protected]> Date: Wed Aug 23 12:57:12 2023 +0300 Fix values shown in the quantize tool help (#2735) Co-authored-by: Iwan Kawrakow <[email protected]> commit 62959e740e8759d246ac8d09036950efde09981c Author: Kawrakow <[email protected]> Date: Wed Aug 23 12:56:42 2023 +0300 Strided perplexity (#2714) * Implementing strided computation of perplexity * Alternative way to output PPL results --------- Co-authored-by: Iwan Kawrakow <[email protected]> commit 7f7ddd5002040804e33fcdbde44aa22f8635f57d Author: IgnacioFDM <[email protected]> Date: Wed Aug 23 06:31:09 2023 -0300 Fix ggml to gguf conversion on Windows (#2733) This fixes `RuntimeWarning: overflow encountered in long_scalars` Credit: anon (not mine) commit af170fc2db1186d3002b602d909c52c22de4a076 Merge: 981c913 b8ad1b6 Author: Concedo <[email protected]> Date: Wed Aug 23 17:08:09 2023 +0800 Merge branch 'master' into concedo_experimental # Conflicts: # README.md # llama.cpp # scripts/sync-ggml.sh # tests/test-tokenizer-0.cpp commit 981c9131f0f20c10099735c1e353534b5bfe1e59 Author: Concedo <[email protected]> Date: Wed Aug 23 16:07:07 2023 +0800 gguf for llama is working commit b8ad1b66b23f9b2e6e4531e9a62753323036a556 Author: Xiao-Yong Jin <[email protected]> Date: Wed Aug 23 02:12:12 2023 -0500 server : allow json array in prompt or content for direct token input (#2306) * server: allow json array in prompt or content We accept an array of strings and numbers representing tokens, in addition to the current string valued prompt or content. This allows direct token input, so that any special tokens can be processed and used at the frontend during the construction of the json data, before sending to the server. 
And the server does not need to know or parse special tokens from textual input. With this, we can use EOS and BOS used in llama-2-chat models. * server: use tokenizePrompt(json) and default "" if empty prompt * server: fix prompt check * server: tokenize endpoint no longer adds BOS commit f5fe98d11bdf9e7797bcfb05c0c3601ffc4b9d26 Author: Evan Jones <[email protected]> Date: Tue Aug 22 21:01:57 2023 -0400 docs : add grammar docs (#2701) * docs : add grammar docs * tweaks to grammar guide * rework GBNF example to be a commented grammar commit 777f42ba18b29f25c71ff8de3ecf97b8017304c0 Author: Kerfuffle <[email protected]> Date: Tue Aug 22 17:39:39 2023 -0600 Improve handling of special tokens in GGML to GGUF converter (#2725) * Improve UNK, BOS, EOS token handling when converting without metadata. * Allow importing as a module. * Remove some obsolete code and minor cleanups. * Set default UNK token mapping from -1 to 0 in llama.cpp * Try to handle overflow due to buggy Windows Python with a better error message commit 46ef5b5fcf4c366e1fb27726b6394adbbf8fd0ea Author: goerch <[email protected]> Date: Tue Aug 22 23:10:42 2023 +0200 llama : fix whitespace escaping in tokenizer (#2724) commit c63bb1d16a70c03440671b76954bb767513cead8 Author: Johannes Gäßler <[email protected]> Date: Tue Aug 22 22:47:05 2023 +0200 CUDA: use mul_mat_q kernels by default (#2683) commit 3b6cfe7c927df178ca3c11643c3ec93e143471c9 Author: Alex Petenchea <[email protected]> Date: Tue Aug 22 21:58:16 2023 +0300 convert.py : clarifying error message (#2718) commit 800c9635b4a9390126f397870f3a825fc7455bd1 Author: Jiahao Li <[email protected]> Date: Wed Aug 23 02:27:06 2023 +0800 Fix CUDA softmax by subtracting max value before exp (#2665) commit deb7dfca4b9725cd295d1426db75fe8e0a6d5312 Author: Georgi Gerganov <[email protected]> Date: Tue Aug 22 20:05:59 2023 +0300 gguf : add ftype meta info to the model (#2710) * llama : add ftype meta info to the model ggml-ci * convert.py : add ftype when converting 
(does not work) * convert.py : fix Enum to IntEnum ggml-ci commit bac66994cf356cf488078c056831396eb4ce31d5 Author: Kawrakow <[email protected]> Date: Tue Aug 22 19:14:09 2023 +0300 Quantization improvements for k_quants (#2707) * Improve LLaMA-2 2-, 3- and 4-bit quantization * Q3_K_S: use Q5_K for 1st 2 layers of attention.wv and feed_forward.w2 * Q4_K_S: use Q6_K for 1st 2 layers of attention.wv and feed_forward.w2 * Q2_K and Q3_K_M: use Q5_K instead of Q4_K for 1st 2 layers of attention.wv and feed_forward.w2 This leads to a slight model size increase as follows: Q2_K : 2.684G vs 2.670G Q3_K_S: 2.775G vs 2.745G Q3_K_M: 3.071G vs 3.057G Q4_K_S: 3.592G vs 3.563G LLaMA-2 PPL for context 512 changes as follows: Q2_K : 6.6691 vs 6.8201 Q3_K_S: 6.2129 vs 6.2584 Q3_K_M: 6.0387 vs 6.1371 Q4_K_S: 5.9138 vs 6.0041 There are improvements for LLaMA-1 as well, but they are much smaller than the above. * Minor 4-bit quantization improvement For the same model size as the previous commit, we get PPL = 5.9069 vs 5.9138. * Some more fine tuning * Adding make_qkx2_quants With it, we get PPL = 5.8828 for L2-7B Q4_K_S. * Another minor improvement * Q2_K improvement Smaller model, lower perplexity.
7B: file size = 2.632G, PPL = 6.3772 vs original 2.670G PPL = 6.8201 12B: file size = 5.056G, PPL = 5.4577 vs original 5.130G PPL = 5.7178 It is mostly Q3_K except for tok_embeddings, attention.wq, attention.wk, which are Q2_K * Iterating * Revert Q5_K back to make_qkx1_quants * Better Q6_K * make_qkx2_quants is better for Q5_K after all * Fix after rebasing on master * Fix for changed tensor names --------- Co-authored-by: Iwan Kawrakow <[email protected]> commit 39cc83e8c9fafe1494c4996b07f97afed29c9f27 Merge: 2d17c22 6381d4e Author: Concedo <[email protected]> Date: Tue Aug 22 23:12:47 2023 +0800 incomplete merge, compiles but generates rubbish commit 519c981f8b65ee6c87c2965539685ced0a17223b Author: slaren <[email protected]> Date: Tue Aug 22 16:03:12 2023 +0200 embedding : evaluate prompt in batches (#2713) commit 1123f7fbdfb8012e46f05e903e6f675922916378 Author: slaren <[email protected]> Date: Tue Aug 22 15:25:19 2023 +0200 ggml-cuda : use graph allocator (#2684) use a different function for no_alloc to avoid breaking backwards compat, fixes lora remove 512 n_batch limit fixed 2048 batch size cleanup Co-authored-by: Johannes Gäßler <[email protected]> commit ef3f333d3775600d1646a9fa249aca532d15fb89 Author: Georgi Gerganov <[email protected]> Date: Tue Aug 22 14:22:08 2023 +0300 ggml : sync latest (SAM + SD operators, CUDA alibi) (#2709) * ggml : sync latest (SAM + SD operators, CUDA alibi) ggml-ci * ggml : fix tabs commit 2d17c224376c0fb2d6cfce8726de5a5f7b666bfe Merge: 36b0c5b dadbed9 Author: Concedo <[email protected]> Date: Tue Aug 22 18:20:06 2023 +0800 functional commit before gguf merge commit 8e4364f2af9cd5d57240f23e83c0e29bc068bc02 Author: slaren <[email protected]> Date: Tue Aug 22 09:56:03 2023 +0200 llama-bench : minor fixes (#2695) commit 1e3bc523d8053a77df3ac7126a84d0297ee97ef6 Author: Kylin <[email protected]> Date: Tue Aug 22 15:14:23 2023 +0800 ggml : support CUDA's half type for aarch64(#1455) (#2670) * ggml: support CUDA's half type for 
aarch64(#1455) support CUDA's half type for aarch64 in ggml_fp16_t definition * ggml: use __CUDACC__ to recognise nvcc compiler commit 14b1d7e6f720dee41ce5a826376df738096d9033 Author: Shouzheng Liu <[email protected]> Date: Tue Aug 22 02:18:40 2023 -0400 metal : add missing barriers for mul-mat (#2699) commit 226255b44ef2c2794bfac48d101d35a9c2dbb965 Author: Jhen-Jie Hong <[email protected]> Date: Tue Aug 22 08:32:00 2023 +0800 server : fallback to default if client param is null (#2688) * server : fallback to default if client param is null * server : do not overwrite 404 if status is 500 from exception_handler commit 930523c8e1cbbee5449c055daa894917fac6805e Author: Kerfuffle <[email protected]> Date: Mon Aug 21 18:01:34 2023 -0600 Fix convert-llama-ggmlv3-to-gguf.py vocab conversion (#2698) When converting without metadata, the hex value for bytes entries weren't 0 padded to 2 digits. commit 2d86b2e219ef988878bdea7e33a534aad3a744da Author: Pontus Mårdnäs <[email protected]> Date: Mon Aug 21 23:46:56 2023 +0200 Add --config argument commit c8dba409e6d6a754090f08e6a862c5ffdd52e421 Author: Georgi Gerganov <[email protected]> Date: Mon Aug 21 23:40:22 2023 +0300 py : remove obsolete script commit 6381d4e110bd0ec02843a60bbeb8b6fc37a9ace9 Author: Georgi Gerganov <[email protected]> Date: Mon Aug 21 23:07:43 2023 +0300 gguf : new file format with flexible meta data (beta) (#2398) * gguf : first API pass * gguf : read header + meta data * gguf : read tensor info * gguf : initial model loading - not tested * gguf : add gguf_get_tensor_name() * gguf : do not support passing existing ggml_context to gguf_init * gguf : simplify gguf_get_val * gguf : gguf.c is now part of ggml.c * gguf : read / write sample models * gguf : add comments * refactor : reduce code duplication and better API (#2415) * gguf : expose the gguf_type enum through the API for now * gguf : add array support * gguf.py : some code style changes * convert.py : start a new simplified implementation by 
removing old stuff * convert.py : remove GGML vocab + other obsolete stuff * GGUF : write tensor (#2426) * WIP: Write tensor * GGUF : Support writing tensors in Python * refactor : rm unused import and upd todos * fix : fix errors upd writing example * rm example.gguf * gitignore *.gguf * undo formatting * gguf : add gguf_find_key (#2438) * gguf.cpp : find key example * ggml.h : add gguf_find_key * ggml.c : add gguf_find_key * gguf : fix writing tensors * gguf : do not hardcode tensor names to read * gguf : write sample tensors to read * gguf : add tokenization constants * quick and dirty conversion example * gguf : fix writing gguf arrays * gguf : write tensors one by one and code reuse * gguf : fix writing gguf arrays * gguf : write tensors one by one * gguf : write tensors one by one * gguf : write tokenizer data * gguf : upd gguf conversion script * Update convert-llama-h5-to-gguf.py * gguf : handle already encoded string * ggml.h : get array str and f32 * ggml.c : get arr str and f32 * gguf.py : support any type * Update convert-llama-h5-to-gguf.py * gguf : fix set is not subscriptable * gguf : update convert-llama-h5-to-gguf.py * constants.py : add layer norm eps * gguf.py : add layer norm eps and merges * ggml.h : increase GGML_MAX_NAME to 64 * ggml.c : add gguf_get_arr_n * Update convert-llama-h5-to-gguf.py * add gptneox gguf example * Makefile : add gptneox gguf example * Update convert-llama-h5-to-gguf.py * add gptneox gguf example * Update convert-llama-h5-to-gguf.py * Update convert-gptneox-h5-to-gguf.py * Update convert-gptneox-h5-to-gguf.py * Update convert-llama-h5-to-gguf.py * gguf : support custom alignment value * gguf : fix typo in function call * gguf : mmap tensor data example * fix : update convert-llama-h5-to-gguf.py * Update convert-llama-h5-to-gguf.py * convert-gptneox-h5-to-gguf.py : Special tokens * gptneox-main.cpp : special tokens * Update gptneox-main.cpp * constants.py : special tokens * gguf.py : accumulate kv and tensor info data + 
special tokens * convert-gptneox-h5-to-gguf.py : accumulate kv and ti + special tokens * gguf : gguf counterpart of llama-util.h * gguf-util.h : update note * convert-llama-h5-to-gguf.py : accumulate kv / ti + special tokens * convert-llama-h5-to-gguf.py : special tokens * Delete gptneox-common.cpp * Delete gptneox-common.h * convert-gptneox-h5-to-gguf.py : gpt2bpe tokenizer * gptneox-main.cpp : gpt2 bpe tokenizer * gpt2 bpe tokenizer (handles merges and unicode) * Makefile : remove gptneox-common * gguf.py : bytesarray for gpt2bpe tokenizer * cmpnct_gpt2bpe.hpp : comments * gguf.py : use custom alignment if present * gguf : minor stuff * Update gptneox-main.cpp * map tensor names * convert-gptneox-h5-to-gguf.py : map tensor names * convert-llama-h5-to-gguf.py : map tensor names * gptneox-main.cpp : map tensor names * gguf : start implementing libllama in GGUF (WIP) * gguf : start implementing libllama in GGUF (WIP) * rm binary commited by mistake * upd .gitignore * gguf : calculate n_mult * gguf : inference with 7B model working (WIP) * gguf : rm deprecated function * gguf : start implementing gguf_file_saver (WIP) * gguf : start implementing gguf_file_saver (WIP) * gguf : start implementing gguf_file_saver (WIP) * gguf : add gguf_get_kv_type * gguf : add gguf_get_kv_type * gguf : write metadata in gguf_file_saver (WIP) * gguf : write metadata in gguf_file_saver (WIP) * gguf : write metadata in gguf_file_saver * gguf : rm references to old file formats * gguf : shorter name for member variable * gguf : rm redundant method * gguf : get rid of n_mult, read n_ff from file * Update gguf_tensor_map.py * Update gptneox-main.cpp * gguf : rm references to old file magics * gguf : start implementing quantization (WIP) * gguf : start implementing quantization (WIP) * gguf : start implementing quantization (WIP) * gguf : start implementing quantization (WIP) * gguf : start implementing quantization (WIP) * gguf : start implementing quantization (WIP) * gguf : quantization is 
working * gguf : roper closing of file * gguf.py : no need to convert tensors twice * convert-gptneox-h5-to-gguf.py : no need to convert tensors twice * convert-llama-h5-to-gguf.py : no need to convert tensors twice * convert-gptneox-h5-to-gguf.py : simplify nbytes * convert-llama-h5-to-gguf.py : simplify nbytes * gptneox-main.cpp : n_layer --> n_block * constants.py : n_layer --> n_block * gguf.py : n_layer --> n_block * convert-gptneox-h5-to-gguf.py : n_layer --> n_block * convert-llama-h5-to-gguf.py : n_layer --> n_block * gptneox-main.cpp : n_layer --> n_block * Update gguf_tensor_map.py * convert-gptneox-h5-to-gguf.py : load model in parts to save memory * convert-llama-h5-to-gguf.py : load model in parts to save memory * convert : write more metadata for LLaMA * convert : rm quantization version * convert-gptneox-h5-to-gguf.py : add file_type key * gptneox-main.cpp : add file_type key * fix conflicts * gguf : add todos and comments * convert-gptneox-h5-to-gguf.py : tensor name map changes * Create gguf_namemap.py : tensor name map changes * Delete gguf_tensor_map.py * gptneox-main.cpp : tensor name map changes * convert-llama-h5-to-gguf.py : fixes * gguf.py : dont add empty strings * simple : minor style changes * gguf : use UNIX line ending * Create convert-llama-7b-pth-to-gguf.py * llama : sync gguf-llama.cpp with latest llama.cpp (#2608) * llama : sync gguf-llama.cpp with latest llama.cpp * minor : indentation + assert * llama : refactor gguf_buffer and gguf_ctx_buffer * llama : minor * gitignore : add gptneox-main * llama : tokenizer fixes (#2549) * Merge tokenizer fixes into the gguf branch. * Add test vocabularies * convert : update convert-new.py with tokenizer fixes (#2614) * Merge tokenizer fixes into the gguf branch. 
* Add test vocabularies * Adapt convert-new.py (and fix a clang-cl compiler error on windows) * llama : sync gguf-llama with llama (#2613) * llama : sync gguf-llama with llama * tests : fix build + warnings (test-tokenizer-1 still fails) * tests : fix wstring_convert * convert : fix layer names * llama : sync gguf-llama.cpp * convert : update HF converter to new tokenizer voodoo magics * llama : update tokenizer style * convert-llama-h5-to-gguf.py : add token types * constants.py : add token types * gguf.py : add token types * convert-llama-7b-pth-to-gguf.py : add token types * gguf-llama.cpp : fix n_head_kv * convert-llama-h5-to-gguf.py : add 70b gqa support * gguf.py : add tensor data layout * convert-llama-h5-to-gguf.py : add tensor data layout * convert-llama-7b-pth-to-gguf.py : add tensor data layout * gptneox-main.cpp : add tensor data layout * convert-llama-h5-to-gguf.py : clarify the reverse permute * llama : refactor model loading code (#2620) * llama : style formatting + remove helper methods * llama : fix quantization using gguf tool * llama : simplify gguf_file_saver * llama : fix method names * llama : simplify write_header() * llama : no need to pass full file loader to the file saver just gguf_ctx * llama : gguf_file_saver write I32 * llama : refactor tensor names (#2622) * gguf: update tensor names searched in quantization * gguf : define tensor names as constants * gguf : initial write API (not tested yet) * gguf : write to file API (not tested) * gguf : initial write API ready + example * gguf : fix header write * gguf : fixes + simplify example + add ggml_nbytes_pad() * gguf : minor * llama : replace gguf_file_saver with new gguf write API * gguf : streaming support when writing files * gguf : remove oboslete write methods * gguf : remove obosolete gguf_get_arr_xxx API * llama : simplify gguf_file_loader * llama : move hparams and vocab from gguf_file_loader to llama_model_loader * llama : merge gguf-util.h in llama.cpp * llama : reorder 
definitions in .cpp to match .h * llama : minor simplifications * llama : refactor llama_model_loader (WIP) wip : remove ggml_ctx from llama_model_loader wip : merge gguf_file_loader in llama_model_loader * llama : fix shape prints * llama : fix Windows build + fix norm_rms_eps key * llama : throw error on missing KV paris in model meta data * llama : improve printing + log meta data * llama : switch print order of meta data --------- Co-authored-by: M. Yusuf Sarıgöz <[email protected]> * gguf : deduplicate (#2629) * gguf : better type names * dedup : CPU + Metal is working * ggml : fix warnings about unused results * llama.cpp : fix line feed and compiler warning * llama : fix strncpy warning + note token_to_str does not write null * llama : restore the original load/save session implementation Will migrate this to GGUF in the future * convert-llama-h5-to-gguf.py : support alt ctx param name * ggml : assert when using ggml_mul with non-F32 src1 * examples : dedup simple --------- Co-authored-by: klosax <[email protected]> * gguf.py : merge all files in gguf.py * convert-new.py : pick #2427 for HF 70B support * examples/gguf : no need to keep q option for quantization any more * llama.cpp : print actual model size * llama.cpp : use ggml_elements() * convert-new.py : output gguf (#2635) * convert-new.py : output gguf (WIP) * convert-new.py : add gguf key-value pairs * llama : add hparams.ctx_train + no longer print ftype * convert-new.py : minor fixes * convert-new.py : vocab-only option should work now * llama : fix tokenizer to use llama_char_to_byte * tests : add new ggml-vocab-llama.gguf * convert-new.py : tensor name mapping * convert-new.py : add map for skipping tensor serialization * convert-new.py : convert script now works * gguf.py : pick some of the refactoring from #2644 * convert-new.py : minor fixes * convert.py : update to support GGUF output * Revert "ci : disable CI temporary to not waste energy" This reverts commit 
7e82d25f40386540c2c15226300ad998ecd871ea. * convert.py : n_head_kv optional and .gguf file extension * convert.py : better always have n_head_kv and default it to n_head * llama : sync with recent PRs on master * editorconfig : ignore models folder ggml-ci * ci : update ".bin" to ".gguf" extension ggml-ci * llama : fix llama_model_loader memory leak * gptneox : move as a WIP example * llama : fix lambda capture ggml-ci * ggml : fix bug in gguf_set_kv ggml-ci * common.h : .bin --> .gguf * quantize-stats.cpp : .bin --> .gguf * convert.py : fix HF tensor permuting / unpacking ggml-ci * llama.cpp : typo * llama : throw error if gguf fails to init from file ggml-ci * llama : fix tensor name grepping during quantization ggml-ci * gguf.py : write tensors in a single pass (#2644) * gguf : single pass for writing tensors + refactoring writer * gguf : single pass for writing tensors + refactoring writer * gguf : single pass for writing tensors + refactoring writer * gguf : style fixes in simple conversion script * gguf : refactor gptneox conversion script * gguf : rename h5 to hf (for HuggingFace) * gguf : refactor pth to gguf conversion script * gguf : rm file_type key and method * gguf.py : fix vertical alignment * gguf.py : indentation --------- Co-authored-by: Georgi Gerganov <[email protected]> * convert-gptneox-hf-to-gguf.py : fixes * gguf.py : gptneox mapping * convert-llama-hf-to-gguf.py : fixes * convert-llama-7b-pth-to-gguf.py : fixes * ggml.h : reverse GGUF_MAGIC * gguf.py : reverse GGUF_MAGIC * test-tokenizer-0.cpp : fix warning * llama.cpp : print kv general.name * llama.cpp : get special token kv and linefeed token id * llama : print number of tensors per type + print arch + style * tests : update vocab file with new magic * editorconfig : fix whitespaces * llama : re-order functions * llama : remove C++ API + reorganize common source in /common dir * llama : minor API updates * llama : avoid hardcoded special tokens * llama : fix MPI build ggml-ci * llama : 
introduce enum llama_vocab_type + remove hardcoded string constants * convert-falcon-hf-to-gguf.py : falcon HF --> gguf conversion, not tested * falcon-main.cpp : falcon inference example * convert-falcon-hf-to-gguf.py : remove extra kv * convert-gptneox-hf-to-gguf.py : remove extra kv * convert-llama-7b-pth-to-gguf.py : remove extra kv * convert-llama-hf-to-gguf.py : remove extra kv * gguf.py : fix for falcon 40b * falcon-main.cpp : fix for falcon 40b * convert-falcon-hf-to-gguf.py : update ref * convert-falcon-hf-to-gguf.py : add tensor data layout * cmpnct_gpt2bpe.hpp : fixes * falcon-main.cpp : fixes * gptneox-main.cpp : fixes * cmpnct_gpt2bpe.hpp : remove non-general stuff * Update examples/server/README.md Co-authored-by: slaren <[email protected]> * cmpnct_gpt2bpe.hpp : cleanup * convert-llama-hf-to-gguf.py : special tokens * convert-llama-7b-pth-to-gguf.py : special tokens * convert-permute-debug.py : permute debug print * convert-permute-debug-master.py : permute debug for master * convert-permute-debug.py : change permute type of attn_q * convert.py : 70b model working (change attn_q permute) * Delete convert-permute-debug-master.py * Delete convert-permute-debug.py * convert-llama-hf-to-gguf.py : fix attn_q permute * gguf.py : fix rope scale kv * convert-llama-hf-to-gguf.py : rope scale and added tokens * convert-llama-7b-pth-to-gguf.py : rope scale and added tokens * llama.cpp : use rope scale kv * convert-llama-7b-pth-to-gguf.py : rope scale fix * convert-llama-hf-to-gguf.py : rope scale fix * py : fix whitespace * gguf : add Python script to convert GGMLv3 LLaMA models to GGUF (#2682) * First pass at converting GGMLv3 LLaMA models to GGUF * Cleanups, better output during conversion * Fix vocab space conversion logic * More vocab conversion fixes * Add description to converted GGUF files * Improve help text, expand warning * Allow specifying name and description for output GGUF * Allow overriding vocab and hyperparams from original model metadata * Use 
correct params override var name * Fix wrong type size for Q8_K Better handling of original style metadata * Set default value for gguf add_tensor raw_shape KW arg * llama : improve token type support (#2668) * Merge tokenizer fixes into the gguf branch. * Add test vocabularies * Adapt convert-new.py (and fix a clang-cl compiler error on windows) * Improved tokenizer test But does it work on MacOS? * Improve token type support - Added @klosax code to convert.py - Improved token type support in vocabulary * Exclude platform dependent tests * More sentencepiece compatibility by eliminating magic numbers * Restored accidentally removed comment * llama : add API for token type ggml-ci * tests : use new tokenizer type API (#2692) * Merge tokenizer fixes into the gguf branch. * Add test vocabularies * Adapt convert-new.py (and fix a clang-cl compiler error on windows) * Improved tokenizer test But does it work on MacOS? * Improve token type support - Added @klosax code to convert.py - Improved token type support in vocabulary * Exclude platform dependent tests * More sentencepiece compatibility by eliminating magic numbers * Restored accidentally removed comment * Improve commentary * Use token type API in test-tokenizer-1.cpp * py : cosmetics * readme : add notice about new file format ggml-ci --------- Co-authored-by: M. 
Yusuf Sarıgöz <[email protected]> Co-authored-by: klosax <[email protected]> Co-authored-by: goerch <[email protected]> Co-authored-by: slaren <[email protected]> Co-authored-by: Kerfuffle <[email protected]> commit dadbed99e65252d79f81101a392d0d6497b86caa Author: Shouzheng Liu <[email protected]> Date: Mon Aug 21 06:59:29 2023 -0400 metal : fix synchronization in new matrix multiplication kernel (#2686) commit cb1c0727bd59803b439b6a3af121c99e6393ff3d Author: Kawrakow <[email protected]> Date: Mon Aug 21 11:11:31 2023 +0300 HellaSwag: split token evaluation into batches if needed (#2681) Co-authored-by: Iwan Kawrakow <[email protected]> commit 9e232f0234073358e7031c1b8d7aa45020469a3b Author: slaren <[email protected]> Date: Sun Aug 20 22:17:53 2023 +0200 ggml : move all type info to ggml_type_traits (#2663) commit 5e9ff54a675d163d9f42aad1b5b3e734f17b2701 Author: Kawrakow <[email protected]> Date: Sun Aug 20 16:44:46 2023 +0300 More efficient Hellaswag implementation (#2677) Co-authored-by: Iwan Kawrakow <[email protected]> commit b34f4bd2724733e188ec4f6074042f66a5ed28c9 Author: YellowRoseCx <[email protected]> Date: Sat Aug 19 17:12:52 2023 -0500 Update README.md commit 1f0bccb27929e261744c979bc75114955da49e98 Author: Georgi Gerganov <[email protected]> Date: Sat Aug 19 00:45:36 2023 +0300 server : better default prompt (#2646) commit f63564adfaa157ca387071d6b9a06cfaef0ef576 Author: Jhen-Jie Hong <[email protected]> Date: Sat Aug 19 05:41:32 2023 +0800 server : update xxd usage for older versions compatibility (#2649) * server : update xxd usage for older versions compatibility * remove unused $func commit 2d8b76a110d76ff6b5728ff0af8477531e4db60e Author: Adrian <[email protected]> Date: Fri Aug 18 12:39:22 2023 -0700 Add link to clojure bindings to Readme. 
(#2659) commit 7af633aec339367e36c867ae709088d6a801aa75 Author: Georgi Gerganov <[email protected]> Date: Fri Aug 18 17:48:31 2023 +0300 readme : incoming BREAKING CHANGE commit 097e121e2f17ed3541cf02c55ff7e9febc091b19 Author: slaren <[email protected]> Date: Fri Aug 18 12:44:58 2023 +0200 llama : add benchmark example (#2626) * llama : add benchmark example * add to examples CMakeLists.txt * fix msvc build * add missing include * add Bessel's correction to stdev calculation Co-authored-by: Johannes Gäßler <[email protected]> * improve markdown formatting * add missing include * print warning is NDEBUG is not defined * remove n_prompt and n_gen from the matrix, use each value separately instead * better checks for non-optimized builds * llama.cpp : fix MEM_REQ_SCRATCH0 reusing the value of n_ctx of the first call * fix json formatting * add sql output * add basic cpu and gpu info (linx/cuda only) * markdown: also show values that differ from the default * markdown: add build id * cleanup * improve formatting * formatting --------- Co-authored-by: Johannes Gäßler <[email protected]> commit eaf98c2649d7da705de255712f0038ac7e47c610 Author: mdrokz <[email protected]> Date: Fri Aug 18 15:47:58 2023 +0530 readme : add link to Rust bindings (#2656) commit e9b12c332ec6e215fbac4b2ef165353acbcd8319 Author: Georgi Gerganov <[email protected]> Date: Fri Aug 18 12:48:55 2023 +0300 perplexity : more meaningful ETA number - 2 decimal points commit 604b8bdfa6320bbcb018eebcc1252dfede603c6b Author: Evan Jones <[email protected]> Date: Thu Aug 17 19:54:44 2023 -0400 Fix unicode in grammars (fixes #2501) (#2553) * Fix unicode in grammars (fixes #2501) * add more comments * fix test-llama-grammar commit 10151bee2e38b5711335c4a38f6ca93b50223acf Author: staviq <[email protected]> Date: Thu Aug 17 23:34:01 2023 +0000 server : support for saving templates in browser LocalStorage (#2486) * support for templates in browser LocalStorage * sync accepted #2409 fix from upstream * convert 
autosave invocation to useEffect * Apply suggestions from code review Co-authored-by: Jhen-Jie Hong <[email protected]> * Regen index.html.cpp, suggested from code review --------- Co-authored-by: Jhen-Jie Hong <[email protected]> commit 0992a7b8b18a89e29a205efb48ceb559c9a08203 Author: Johannes Gäßler <[email protected]> Date: Thu Aug 17 23:57:59 2023 +0200 README: fix LLAMA_CUDA_MMV_Y documentation (#2647) commit 6ddeefad9b634c5c79e6bcf046523493ff1fdf7d Author: Henri Vasserman <[email protected]> Date: Thu Aug 17 23:11:18 2023 +0300 [Zig] Fixing Zig build and improvements (#2554) * Fix zig after console.o was split * Better include and flag management * Change LTO to option commit 36b0c5b39816c039a5235733cfcd2b4e32386ff9 Author: Concedo <[email protected]> Date: Thu Aug 17 22:45:49 2023 +0800 fix for incorrect missing backends displayed commit 8dae7ce68437faf1fa96ec0e7687b8700956ef20 Author: Kerfuffle <[email protected]> Date: Thu Aug 17 07:29:44 2023 -0600 Add --cfg-negative-prompt-file option for examples (#2591) Add --cfg-negative-prompt-file option for examples commit a73ccf1aa34de49f61bfeb7f8a679c3bfdb3abe3 Author: Georgi Gerganov <[email protected]> Date: Thu Aug 17 10:47:09 2023 +0300 llama : replace (permute + reshape + view_1d) with (view_3d) (#2538) ggml-ci commit 7cf54e1f746941279d81d485796777c01f88049c Author: drbh <[email protected]> Date: Thu Aug 17 03:41:01 2023 -0400 tests : adds simple llama grammar tests (#2618) * adds simple llama grammar tests * fix lint and add Makefile * 0 terminate code_points * avoid dangling pointers in candidate cleanup * cleanup grammar at end of test commit a872a2b28eaefc8d464eaa535c94deeb501666f9 Author: Shouzheng Liu <[email protected]> Date: Thu Aug 17 03:35:53 2023 -0400 ggml-alloc : fix discrepency between measure&eval (#2639) The GGML memory allocator consistently places a tensor within the optimal-fit memory block, which is the smallest block capable of accommodating the tensor's size. 
During the measurement phase, the final block is generously sized, ensuring it never qualifies as the optimal-fit block as long as there exists another block capable of accommodating the tensor. Nevertheless, in the evaluation phase, the last block is constrained in size and could potentially qualify as the optimal-fit block. Consequently, there exists the possibility of a tensor being allocated to a different region during evaluation, leading to more memory fragmentation in our scratch buffer. This recent commit guarantees uniform behavior of the allocator across both the measurement and evaluation phases, eliminating discrepancies between the two. commit 0919a0f73d95cfb93a1646a1d1741a0615fe2c5e Author: Kolen Cheung <[email protected]> Date: Wed Aug 16 21:09:49 2023 +0100 cmake : install ggml-meta.metal if LLAMA_METAL (#2449) commit ed53db86c3b0e0815331a96d7a379edb5e62472c Author: Jhen-Jie Hong <[email protected]> Date: Thu Aug 17 04:09:03 2023 +0800 metal : print error of load pipeline state (#2564) * metal : print error of load pipeline state * metal : return null if load pipeline failed commit fc8ef549e50087762a0b4f901cd74b2defcc6ae3 Author: Shouzheng Liu <[email protected]> Date: Wed Aug 16 16:08:28 2023 -0400 metal : enable ggml-alloc (#2627) * metal: enable ggml-alloc Make ggml-alloc work with concurrently dispatch. * style-fix Co-authored-by: slaren <[email protected]> --------- Co-authored-by: slaren <[email protected]> Co-authored-by: Georgi Gerganov <[email protected]> commit bf83bff6742c0f1795b4c18695a13a34ac7adf62 Author: Shouzheng Liu <[email protected]> Date: Wed Aug 16 16:07:04 2023 -0400 metal : matrix-matrix multiplication kernel (#2615) * metal: matrix-matrix multiplication kernel This commit removes MPS and uses custom matrix-matrix multiplication kernels for all quantization types. This commit also adds grouped-query attention to support llama2 70B. 
* metal: fix performance degradation from gqa Integers are slow on the GPU, and 64-bit divides are extremely slow. In the context of GQA, we introduce a 64-bit divide that cannot be optimized out by the compiler, which results in a decrease of ~8% in inference performance. This commit fixes that issue by calculating a part of the offset with a 32-bit divide. Naturally, this limits the size of a single matrix to ~4GB. However, this limitation should suffice for the near future. * metal: fix bugs for GQA and perplexity test. I mixed up ne02 and nb02 in previous commit. commit 075d079a72c741050a4c31a27530c8af19df70a6 Merge: 469d70b b5ffb28 Author: Concedo <[email protected]> Date: Wed Aug 16 10:43:06 2023 +0800 Merge branch 'master' into concedo_experimental # Conflicts: # CMakeLists.txt # Makefile # ggml-cuda.cu # llama-util.h # tests/CMakeLists.txt commit b5ffb2849d23afe73647f68eec7b68187af09be6 Author: Georgi Gerganov <[email protected]> Date: Tue Aug 15 10:04:58 2023 +0300 scripts : add helper script to get wikitext commit 469d70be45dfdac4d926c1326b579e88d0f0e040 Author: Concedo <[email protected]> Date: Tue Aug 15 13:49:05 2023 +0800 add support for precompiled binaries, used as a fallback commit 7d1196108ad330b32845546fb3472c2172a0b6b8 Author: YellowRoseCx <[email protected]> Date: Mon Aug 14 23:03:12 2023 -0500 remove force DMMV commit 3ebb00935f3f0522b75df49c2769ab1774b91380 Author: Jhen-Jie Hong <[email protected]> Date: Tue Aug 15 06:14:14 2023 +0800 server : add missing /json-schema-to-grammar.mjs (#2616) fixes #2611 commit d783f7982e0e823a2626a9956359c0d36c1a7e21 Author: Jhen-Jie Hong <[email protected]> Date: Mon Aug 14 21:37:39 2023 +0800 metal : return null instead of exit(1) (#2573) commit d75561df207d22790609ee0ad924302f66ac2599 Author: Cheng Shao <[email protected]> Date: Mon Aug 14 15:36:42 2023 +0200 server : add --numa support (#2524) commit 348acf188c9fbe66396990f2dc83229df367969b Author: Kamil Tomšík <[email protected]> Date: Mon Aug 14 15:35:16 
2023 +0200 llama : add missing enum keyword in function signatures (#2610) commit 1cd06fa25eb859b14b3427a1d815a48f25fc3c34 Author: Johannes Gäßler <[email protected]> Date: Mon Aug 14 10:41:22 2023 +0200 CUDA: launch_bounds, small q4_K, q5_K mmq refactor (#2596) commit 2feb8934eb75ca63f3c42724229cce1df9579c8e Author: Jhen-Jie Hong <[email protected]> Date: Mon Aug 14 16:20:17 2023 +0800 server : fix default grammar by use empty string in the UI (#2604) commit 5517d6e69214cdead000a76983b9fe175c3f8329 Author: Jhen-Jie Hong <[email protected]> Date: Mon Aug 14 15:16:54 2023 +0800 server : implement json-schema-to-grammar.mjs & add grammar param in the UI (#2588) * server : implement json-schema-to-grammar.mjs by follow python impl * server : add grammar support in chat.mjs * server : implement grammer param in the UI * server : generate .hpp * server : remove trailing whitespaces * server : generate .hpp * server : fix sort of prop pairs * server : optimize regex & iteration commit f31b5397143009d682db90fd2a6cde83f1ef00eb Author: vxiiduu <[email protected]> Date: Mon Aug 14 13:59:16 2023 +1000 Enhance Windows 7 and below compatibility. (#2592) * Enhance Windows 7 compatibility. 
* Clean away unnecessary preprocessor conditional commit ee77efea2a1e3f7d153976b0934522b6bbaa62e6 Author: drbh <[email protected]> Date: Sun Aug 13 10:00:48 2023 -0400 test : add simple grammar parsing tests (#2594) * adds simple grammar parsing tests * adds cassert header commit f64d44a9b9581cd58f7ec40f4fa1c3ca5ca18e1e Author: Johannes Gäßler <[email protected]> Date: Sun Aug 13 00:24:45 2023 +0200 CUDA: Fixed OpenLLaMA 3b mmq, reduced compile time (#2590) commit cd61aa0d9e16627935c7978adf488a679ddfa745 Author: YellowRoseCx <[email protected]> Date: Sat Aug 12 17:24:31 2023 -0500 restore main_gpu parameter commit 4a042f326830271a4c31104051b7b08e08ac234e Author: Henri Vasserman <[email protected]> Date: Sat Aug 12 10:51:46 2023 +0300 gfx1100 support --------- Co-authored-by: ardfork <[email protected]> Co-authored-by: jammm <[email protected]> Co-authored-by: jdecourval <[email protected]> commit 8913bc6fea97d3cb860937b0461f455c6abe3ea1 Author: Henri Vasserman <[email protected]> Date: Fri Aug 11 10:16:02 2023 +0300 Allow overriding CC_TURING commit e77a4c37a756c002e97173f4122e088fb304e18a Author: Henri Vasserman <[email protected]> Date: Fri Aug 11 10:00:07 2023 +0300 Merge 'origin/master' into hipblas commit cc4c4e355cd553b1557d5fba2562e824db93f9b4 Author: Engininja2 <[email protected]> Date: Fri Aug 11 09:43:14 2023 +0300 New __dp4a assembly Now compatible with gfx900 and faster as well. commit 1a03b709848ce68d5bf5966237756167e2cac540 Author: Henri Vasserman <[email protected]> Date: Fri Aug 11 09:30:28 2023 +0300 Undo mess --------- Co-authored-by: ardfork <[email protected]> commit 4366ff9ba1b1f12e494118ef9b5198479022fcc5 Author: DannyDaemonic <[email protected]> Date: Thu Aug 10 13:11:36 2023 -0700 Handle `ENABLE_VIRTUAL_TERMINAL_PROCESSING` more gracefully on earlier versions of Windows. 
commit 811ff855a24323cafddc95c1b8aca711fef05f76 Author: Christian Demsar <[email protected]> Date: Thu Aug 10 10:28:27 2023 -0400 Add --n-predict -2 for stopping generation on full context (#2565) commit 37c9717aaa6815b6a5be21aaab970212f20fe6bf Author: Martin Krasser <[email protected]> Date: Thu Aug 10 12:16:38 2023 +0200 Fix grammar-based sampling issue in server (#2566) commit 9483288e0318a4dcc2e08eb817dfdd09c6552533 Merge: dae9dff b19edd5 Author: Concedo <[email protected]> Date: Sat Aug 12 16:04:11 2023 +0800 Merge branch 'master' into concedo_experimental # Conflicts: # Makefile commit b19edd54d51cef5e3616c18b1d0d8626895b2cba Author: byte-6174 <[email protected]> Date: Fri Aug 11 19:17:25 2023 -0400 Adding support for llama2.c models (#2559) commit 53dc399472d5bd35ee739b865e843b1996bd3814 Author: Equim <[email protected]> Date: Sat Aug 12 06:35:14 2023 +0800 server: fixed wrong variable name in timing json (#2579) * server: fixed wrong variable name in timing json * remove redunct entry commit dae9dffa6aa53923cfbb09ac5de7e08f34920733 Author: Concedo <[email protected]> Date: Fri Aug 11 14:54:27 2023 +0800 rename koboldcpp.dll to koboldcpp_default.dll commit 9ca4abed893685692f90413e4d43153af12342d9 Author: DannyDaemonic <[email protected]> Date: Thu Aug 10 13:11:36 2023 -0700 Handle `ENABLE_VIRTUAL_TERMINAL_PROCESSING` more gracefully on earlier versions of Windows. commit d18ecd5b9e5dde58ae08a3eef1637406159ddaca Author: YellowRoseCx <[email protected]> Date: Thu Aug 10 13:19:41 2023 -0500 make mmq gen faster for amd commit 243894a952147a4fac5b6aee748861a0df6cc2c6 Author: Henri Vasserman <[email protected]> Date: Thu Aug 10 12:14:40 2023 +0300 ws fix commit ac2f14da445ea87d73539adbd29d19ff2c9eba58 Author: Engininja2 <[email protected]> Date: Thu Aug 10 12:11:27 2023 +0300 AMD assembly optimized __dp4a Doesn't seem to work for gfx900, so commented out. 
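For context on the `__dp4a` commits above: in CUDA, `__dp4a(a, b, c)` treats `a` and `b` as four packed signed 8-bit lanes each, takes their dot product, and adds the 32-bit accumulator `c`. The sketch below is only a reference model of that arithmetic in Python. It is not the AMD assembly from the commit, the helper names `pack_int8x4` and `dp4a` are hypothetical, and the wrap to signed 32 bits on overflow is an assumption of this model.

```python
import struct


def pack_int8x4(lanes):
    """Pack four signed 8-bit values into one 32-bit word (hypothetical helper)."""
    return struct.unpack("<I", struct.pack("<4b", *lanes))[0]


def dp4a(a, b, c):
    """Reference model of a signed 4-lane int8 dot product with accumulate."""
    # Split each 32-bit word back into four signed 8-bit lanes.
    a_lanes = struct.unpack("<4b", struct.pack("<I", a & 0xFFFFFFFF))
    b_lanes = struct.unpack("<4b", struct.pack("<I", b & 0xFFFFFFFF))
    total = c + sum(x * y for x, y in zip(a_lanes, b_lanes))
    # Assumption: model the accumulator as a wrapping signed 32-bit integer.
    total &= 0xFFFFFFFF
    return total - (1 << 32) if total & 0x80000000 else total


if __name__ == "__main__":
    a = pack_int8x4([1, 2, 3, 4])
    b = pack_int8x4([5, 6, 7, 8])
    print(dp4a(a, b, 10))
```

A model like this can serve as a cross-check when comparing a hand-written polyfill against the intrinsic on a few known inputs, including negative lanes.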
commit 9dba0c985f140ddded8cbb671f139e81fff82eed Author: Henri Vasserman <[email protected]> Date: Thu Aug 10 12:09:28 2023 +0300 Fix merge --------- Co-authored-by: ardfork <[email protected]> Co-authored-by: Kerfuffle <[email protected]> commit e59fcb2bc129881f4a269fee748fb38bce0a64de Author: Christian Demsar <[email protected]> Date: Thu Aug 10 10:28:27 2023 -0400 Add --n-predict -2 for stopping generation on full context (#2565) commit 886f4eed7948f494e3da1d48d4f6f844e2f9a2c2 Author: Concedo <[email protected]> Date: Thu Aug 10 22:01:33 2023 +0800 updated lite, up ver, remove bell commit 1638757767072a4957f52b9e3594f0b67610631b Author: Martin Krasser <[email protected]> Date: Thu Aug 10 12:16:38 2023 +0200 Fix grammar-based sampling issue in server (#2566) commit c5f5209d37b09325377e36f39eab0b0f0c0d006e Author: Concedo <[email protected]> Date: Thu Aug 10 16:30:02 2023 +0800 globalize args commit f570b5cb1070591527a82d94bba408927b37778d Author: YellowRoseCx <[email protected]> Date: Wed Aug 9 22:11:20 2023 -0500 Revert "revert cuda changes as they are bugggy" This reverts commit 1541bf879772aeeed8ff646bfc52185c2a88b79b. 
commit 1541bf879772aeeed8ff646bfc52185c2a88b79b Author: Concedo <[email protected]> Date: Wed Aug 9 22:36:41 2023 +0800 revert cuda changes as they are bugggy commit bacc20203efb1839aa313858a04d75255bb4b7f4 Author: YellowRoseCx <[email protected]> Date: Wed Aug 9 20:37:17 2023 -0500 Merge remote-tracking branch 'upstream/concedo' commit b7cb4cfd109986bd66e8fd382d1e2516eaddfebb Author: YellowRoseCx <[email protected]> Date: Wed Aug 9 20:00:52 2023 -0500 additional fixes commit fadae727baa3735ad3e0667384d6e05ca056b3ef Merge: 518eb2a 8f8ab6c Author: YellowRoseCx <[email protected]> Date: Wed Aug 9 18:45:50 2023 -0500 Merge branch 'hipblas' into develop4Main commit 518eb2af9225f8300a108c4244c7eb0a2217c3bc Merge: bda0215 cae6a84 Author: YellowRoseCx <[email protected]> Date: Wed Aug 9 18:32:10 2023 -0500 Merge remote-tracking branch 'upstream/concedo' into develop2Main commit bda0215b413bafc49890aa23fc35f96a191fb3e0 Author: YellowRoseCx <[email protected]> Date: Wed Aug 9 18:17:54 2023 -0500 update makefile to multisystem path commit 8f8ab6c4c049df501e9a5ed8fef3aa0fc0691421 Author: YellowRoseCx <[email protected]> Date: Wed Aug 9 18:05:03 2023 -0500 hipLDFLAG Path change Unix to multisystem in Makefile changed the hardcoded linux distro hipblas LD path from -L/opt/rocm/lib to use the defined ROCM_PATH variable to be flexible with ROCm on non-Linux OS commit 610ba4cfc460ed65c4adc32d3365a216690384d5 Merge: 4024f91 25d43e0 Author: Henri Vasserman <[email protected]> Date: Wed Aug 9 23:54:58 2023 +0300 Merge 'origin/master' into hipblas commit 916a9acdd0a411426690400ebe2bb7ce840a6bba Author: Sam Spilsbury <[email protected]> Date: Wed Aug 9 23:47:42 2023 +0300 ggml-alloc: Don't try to re-use buffers of external tensors (#2562) * ggml-alloc: Don't try to re-use buffers of external tensors They might be weights that came from another context, so we have no control over them (and they might be re-used elsewhere so writing to them would be a bad idea). 
* ggml-alloc: >= when checking for out-of-bounds Co-authored-by: slaren <[email protected]> --------- Co-authored-by: slaren <[email protected]> commit ea04a4ca1940d92becc0ee26523aa2c4a18cf938 Author: grahameth <[email protected]> Date: Wed Aug 9 22:46:40 2023 +0200 add log_callback to llama_context_params for custom logging. (#2234) * add log_callback to llama_context_params for custom logging. * Fix macro expansion on gcc * Add struct llama_state for global variables and move log_callback there * Turn log level into enum and some minor changes. * Remove model_for_logging parameter (not needed anymore) * Convert remaining fprintf(stderr, ...) calls to use new macros. * Fix enum and initialize g_state * Fix log calls after merge * Fix missing static * Add back all the new lines in the logging strings * Add comment for llama_log_callback and replace remaining printf calls --------- Co-authored-by: grahameth <-> Co-authored-by: Helmut <[email protected]> commit a07e6dd3ad1a622f08c3187799879d4f1c49bad4 Author: Concedo <[email protected]> Date: Wed Aug 9 22:36:41 2023 +0800 revert cuda changes as they are bugggy commit f8376c7e610f68d07e079ff91f6988fb7a8399e2 Author: Concedo <[email protected]> Date: Wed Aug 9 21:23:33 2023 +0800 up ver, fixed compile (+1 squashed commits) Squashed commits: [ca51aa9e] up ver commit ba09f1c807956c59d8c64988626e95459f627ced Merge: 3a7853d 25d43e0 Author: Concedo <[email protected]> Date: Wed Aug 9 21:18:34 2023 +0800 Merge branch 'master' into concedo_experimental # Conflicts: # README.md # ggml-cuda.cu commit 3a7853d259c242d4977e9f4dc7627a799d5812b4 Author: Concedo <[email protected]> Date: Wed Aug 9 21:07:57 2023 +0800 handle stablecode-completion-alpha-3b commit 25d43e0eb578b6e73046d9d6644a3a14d460600d Author: Johannes Gäßler <[email protected]> Date: Wed Aug 9 09:42:34 2023 +0200 CUDA: tuned mul_mat_q kernels (#2546) commit 90058d96b0c6ab77802e153c23fad66d2f21a438 Author: Concedo <[email protected]> Date: Wed Aug 9 15:28:07 2023 
+0800 sleep longer before exit commit 19cf2a8663938c424407544c13749f371104517b Author: Concedo <[email protected]> Date: Wed Aug 9 12:42:59 2023 +0800 add idle field and up ver commit 4b8a354895e078d3f0cafdf53430d72d3af8bb99 Author: Concedo <[email protected]> Date: Wed Aug 9 12:25:21 2023 +0800 cudatoolkit version commit 159ad9269d95bc07720c79debc23b5c466357b53 Author: Concedo <[email protected]> Date: Wed Aug 9 11:50:12 2023 +0800 up ver, set the cuda pool malloc lookahead back to 5% instead of 2% (+1 squashed commits) Squashed commits: [e0f65278] up ver, set the cuda pool malloc lookahead back to 5% instead of 2% commit 4024f91a665d83b6de8658d45ec9d004c5d90c79 Author: Henri Vasserman <[email protected]> Date: Wed Aug 9 01:56:44 2023 +0300 Add intrinsics polyfills for AMD --------- Co-authored-by: ardfork <[email protected]> Co-authored-by: funnbot <[email protected]> Co-authored-by: Engininja2 <[email protected]> commit ab6212864ce8e9af200bcedb3e0126ee49aa8d0a Merge: d91456a f5bfea0 Author: Henri Vasserman <[email protected]> Date: Wed Aug 9 00:37:01 2023 +0300 Merge 'origin/master' into hipblas commit 926d90fbabe836d16a5326eb99bdcb89ca0fc042 Merge: 793cfd1 f5bfea0 Author: Concedo <[email protected]> Date: Wed Aug 9 01:09:04 2023 +0800 Merge branch 'master' into concedo_experimental # Conflicts: # Makefile commit 793cfd136cc721884f79d09036b748e4f176cdb4 Author: Concedo <[email protected]> Date: Wed Aug 9 01:05:00 2023 +0800 fixed 70B detection again, try fix horde issues, fixed lite unicode issue, fixed cmake for cuda commit f5bfea0580e417f99850d5456ca541d871a3e48c Author: Martin Krasser <[email protected]> Date: Tue Aug 8 15:29:19 2023 +0200 Allow passing grammar to completion endpoint (#2532) * Allow passing grammar to completion endpoint commit acfc5478ff3446ca3b54553967a3dea09b7c771a Author: Johannes Gäßler <[email protected]> Date: Tue Aug 8 14:38:16 2023 +0200 CUDA: tighter VRAM scratch size for 65b/70b (#2551) commit 
7ed8d1fe7f8cbe6a6763e6b46759795ac8d21e12 Author: chaihahaha <[email protected]> Date: Tue Aug 8 20:07:02 2023 +0800 llm.vim : multiline autocompletion, get rid of "^@" (#2543) commit e7f94d6fdc83b41ba449b4b8c80821673dd12ffc Author: Georgi Gerganov <[email protected]> Date: Tue Aug 8 15:05:30 2023 +0300 vim : bring back simple llm.vim example commit 2d7baaf50f3277e65cf71071f61ea34823d14c30 Author: AustinMroz <[email protected]> Date: Tue Aug 8 06:44:48 2023 -0500 vim : streaming and more (#2495) * Update Vim plugin * Remove getbufoneline usage, Add input bind example. getbufoneline() appears to be a recently added function and has been replaced with getbufline for compatibility. An additional example that explains how to add a keybind that works in insert mode was added. commit f3c3b4b1672d860800639c87d3b5d17564692469 Author: klosax <[email protected]> Date: Mon Aug 7 19:07:19 2023 +0200 Add --rope-scale parameter (#2544) * common.cpp : Add --rope-scale parameter * README.md : Add info about using linear rope scaling commit 3554080502cb050ccc3ae11d7a67df866ac3bd07 Author: Concedo <[email protected]> Date: Tue Aug 8 00:41:02 2023 +0800 fixed blasbatchmul multiplier commit 28ad80b6e4d38dde9e395fc5d4ebf19dc4aa4b66 Merge: 3c7d938 93356bd Author: Concedo <[email protected]> Date: Tue Aug 8 00:34:10 2023 +0800 Merge branch 'master' into concedo_experimental commit 3c7d938d95fd51780be37f10cdddb2f26a770adf Author: Concedo <[email protected]> Date: Tue Aug 8 00:32:51 2023 +0800 update lite, resize scratch buffers for blasbatch 2048 commit 93356bdb7a324a8f6570f99d02af392cd4c45796 Author: Georgi Gerganov <[email protected]> Date: Mon Aug 7 14:25:58 2023 +0300 ggml : mul mat tweaks (#2372) * ggml : mul mat wip ggml-ci * ggml : alternative thread distribution for mul_mat ggml-ci * ggml : mul_mat block tiling attempt * ggml : mul_mat threads yield ggml-ci commit 60baff7c8584ec369e53469cad5f92e102b1efe4 Author: Georgi Gerganov <[email protected]> Date: Mon Aug 7 14:24:42 2023 
+0300 ggml : pad result of ggml_nbytes() commit 9082b5dfbfae01243a0b822dcd2812877e63bf1b Author: Georgi Gerganov <[email protected]> Date: Mon Aug 7 13:55:18 2023 +0300 ggml : change params pointer (style change) (#2539) ggml-ci commit 99d29c0094476c4962023036ecd61a3309d0e16b Author: Georgi Gerganov <[email protected]> Date: Mon Aug 7 13:20:09 2023 +0300 ggml : sync (custom ops) (#2537) ggml-ci commit 9133e456d2d52b05c6c7f92cd94a0d2564ddb2f7 Merge: cae6a84 3d9a551 Author: Concedo <[email protected]> Date: Mon Aug 7 17:33:42 2023 +0800 Merge branch 'master' into concedo_experimental # Conflicts: # Makefile # build.zig commit cae6a847ada88e415b0beda09d70d79b51762618 Author: Concedo <[email protected]> Date: Mon Aug 7 16:40:13 2023 +0800 cuda free only for non mmq (+2 squashed commit) Squashed commit: [3aca763a] only cuda free for non mmq [e69a8c9f] revert to pool alloc to try again commit 3d9a55181603e85a26378a850a14068034e5002d Author: Johannes Gäßler <[email protected]> Date: Mon Aug 7 10:09:40 2023 +0200 Fixed mmap prefetch for GPU offloading (#2529) commit f6f9896ac3d2ff207e18f87dab85d126ceef5236 Author: Georgi Gerganov <[email protected]> Date: Mon Aug 7 10:52:57 2023 +0300 metal : fix out-of-bounds access + inc concurrency nodes (#2416) * metal : fix out-of-bounds access + style changes * metal : increase concurrency nodes to 2*GGML_MAX_NODES commit 9f16a4c4efc5cca845e027c1dbad615612b9248c Author: Concedo <[email protected]> Date: Mon Aug 7 15:16:37 2023 +0800 switch to upstream implementation of pool malloc commit 34a14b28ff7f3c98730339bacee035091b2a812a Author: GiviMAD <[email protected]> Date: Sun Aug 6 23:21:46 2023 -0700 [Makefile] Move ARM CFLAGS before compilation (#2536) commit 7297128db8159c7b12db4c28a4532b993025c2e5 Author: Henri Vasserman <[email protected]> Date: Mon Aug 7 08:35:53 2023 +0300 [Zig] Rewrite build for Zig 0.11 (#2514) * zig build fixes * Disable LTO on Windows. 
commit 6659652c9fd1853dcb2d1882efc8f14b159d5d43 Author: Concedo <[email protected]> Date: Mon Aug 7 11:05:06 2023 +0800 lower actual temp used when temp=0 commit 0e41b94f40e1d10893d6ac29c727482573ef1652 Author: Concedo <[email protected]> Date: Mon Aug 7 10:43:06 2023 +0800 improve detection for 70B. commit fb44d72a78a81790d238ffd2453cf66d02eed688 Merge: 559c0e2 d9024df Author: Concedo <[email protected]> Date: Mon Aug 7 10:17:43 2023 +0800 Merge remote-tracking branch 'johannes/cuda-fix-mmap-prefetch' into concedo_experimental commit 559c0e2d1f621402d410944b5291da647243ab33 Author: Concedo <[email protected]> Date: Mon Aug 7 10:15:20 2023 +0800 updated lite again, fix for wi commit d9024df759b25d030fc8266d399c565fe7be9a04 Author: JohannesGaessler <[email protected]> Date: Sun Aug 6 10:18:05 2023 +0200 Fixed mmap prefetch for GPU offloading commit d442888626f11335e0c9e3b8555d2429b3262580 Merge: 198cc82 86c3219 Author: Concedo <[email protected]> Date: Sun Aug 6 22:47:33 2023 +0800 Merge branch 'master' into concedo_experimental # Conflicts: # M…

commit 3416c98 Merge: 5eb17f0 4c4e435 Author: YellowRoseCx <[email protected]> Date: Fri Aug 25 13:46:56 2023 -0500 Merge remote-tracking branch 'upstream/concedo' commit 5eb17f0 Author: YellowRoseCx <[email protected]> Date: Fri Aug 25 13:38:21 2023 -0500 ROCm Port update * use hipblas based on cublas * Update Makefile for the Cuda kernels * Expand arch list and make it overrideable * Fix multi GPU on multiple amd architectures with rocblas_initialize() (#5) * add hipBLAS to README * new build arg LLAMA_CUDA_MMQ_Y * fix half2 decomposition * Add intrinsics polyfills for AMD * AMD assembly optimized __dp4a * Allow overriding CC_TURING * use "ROCm" instead of "CUDA" * ignore all build dirs * Add Dockerfiles * fix llama-bench * fix -nommq help for non CUDA/HIP --------- Co-Authored-By: YellowRoseCx <[email protected]> Co-Authored-By: ardfork <[email protected]> Co-Authored-By: funnbot <[email protected]> Co-Authored-By: Engininja2 <[email protected]> Co-Authored-By: Kerfuffle <[email protected]> Co-Authored-By: jammm <[email protected]> Co-Authored-By: jdecourval <[email protected]> commit b34f4bd Author: YellowRoseCx <[email protected]> Date: Sat Aug 19 17:12:52 2023 -0500 Update README.md commit 7d11961 Author: YellowRoseCx <[email protected]> Date: Mon Aug 14 23:03:12 2023 -0500 remove force DMMV commit cd61aa0 Author: YellowRoseCx <[email protected]> Date: Sat Aug 12 17:24:31 2023 -0500 restore main_gpu parameter commit 4a042f3 Author: Henri Vasserman <[email protected]> Date: Sat Aug 12 10:51:46 2023 +0300 gfx1100 support --------- Co-authored-by: ardfork <[email protected]> Co-authored-by: jammm <[email protected]> Co-authored-by: jdecourval <[email protected]> commit 8913bc6 Author: Henri Vasserman <[email protected]> Date: Fri Aug 11 10:16:02 2023 +0300 Allow overriding CC_TURING commit e77a4c3 Author: Henri Vasserman <[email protected]> Date: Fri Aug 11 10:00:07 2023 +0300 Merge 'origin/master' into hipblas commit cc4c4e3 Author: Engininja2 <[email 
protected]> Date: Fri Aug 11 09:43:14 2023 +0300 New __dp4a assembly Now compatible with gfx900 and faster as well. commit 1a03b70 Author: Henri Vasserman <[email protected]> Date: Fri Aug 11 09:30:28 2023 +0300 Undo mess --------- Co-authored-by: ardfork <[email protected]> commit 4366ff9 Author: DannyDaemonic <[email protected]> Date: Thu Aug 10 13:11:36 2023 -0700 Handle `ENABLE_VIRTUAL_TERMINAL_PROCESSING` more gracefully on earlier versions of Windows. commit 811ff85 Author: Christian Demsar <[email protected]> Date: Thu Aug 10 10:28:27 2023 -0400 Add --n-predict -2 for stopping generation on full context (ggml-org#2565) commit 37c9717 Author: Martin Krasser <[email protected]> Date: Thu Aug 10 12:16:38 2023 +0200 Fix grammar-based sampling issue in server (ggml-org#2566) commit d18ecd5 Author: YellowRoseCx <[email protected]> Date: Thu Aug 10 13:19:41 2023 -0500 make mmq gen faster for amd commit 243894a Author: Henri Vasserman <[email protected]> Date: Thu Aug 10 12:14:40 2023 +0300 ws fix commit ac2f14d Author: Engininja2 <[email protected]> Date: Thu Aug 10 12:11:27 2023 +0300 AMD assembly optimized __dp4a Doesn't seem to work for gfx900, so commented out. commit 9dba0c9 Author: Henri Vasserman <[email protected]> Date: Thu Aug 10 12:09:28 2023 +0300 Fix merge --------- Co-authored-by: ardfork <[email protected]> Co-authored-by: Kerfuffle <[email protected]> commit f570b5c Author: YellowRoseCx <[email protected]> Date: Wed Aug 9 22:11:20 2023 -0500 Revert "revert cuda changes as they are bugggy" This reverts commit 1541bf8. 
commit 1541bf8 Author: Concedo <[email protected]> Date: Wed Aug 9 22:36:41 2023 +0800 revert cuda changes as they are bugggy commit bacc202 Author: YellowRoseCx <[email protected]> Date: Wed Aug 9 20:37:17 2023 -0500 Merge remote-tracking branch 'upstream/concedo' commit b7cb4cf Author: YellowRoseCx <[email protected]> Date: Wed Aug 9 20:00:52 2023 -0500 additional fixes commit fadae72 Merge: 518eb2a 8f8ab6c Author: YellowRoseCx <[email protected]> Date: Wed Aug 9 18:45:50 2023 -0500 Merge branch 'hipblas' into develop4Main commit 518eb2a Merge: bda0215 cae6a84 Author: YellowRoseCx <[email protected]> Date: Wed Aug 9 18:32:10 2023 -0500 Merge remote-tracking branch 'upstream/concedo' into develop2Main commit bda0215 Author: YellowRoseCx <[email protected]> Date: Wed Aug 9 18:17:54 2023 -0500 update makefile to multisystem path commit 8f8ab6c Author: YellowRoseCx <[email protected]> Date: Wed Aug 9 18:05:03 2023 -0500 hipLDFLAG Path change Unix to multisystem in Makefile changed the hardcoded linux distro hipblas LD path from -L/opt/rocm/lib to use the defined ROCM_PATH variable to be flexible with ROCm on non-Linux OS commit 610ba4c Merge: 4024f91 25d43e0 Author: Henri Vasserman <[email protected]> Date: Wed Aug 9 23:54:58 2023 +0300 Merge 'origin/master' into hipblas commit 4024f91 Author: Henri Vasserman <[email protected]> Date: Wed Aug 9 01:56:44 2023 +0300 Add intrinsics polyfills for AMD --------- Co-authored-by: ardfork <[email protected]> Co-authored-by: funnbot <[email protected]> Co-authored-by: Engininja2 <[email protected]> commit ab62128 Merge: d91456a f5bfea0 Author: Henri Vasserman <[email protected]> Date: Wed Aug 9 00:37:01 2023 +0300 Merge 'origin/master' into hipblas commit ee9fa2a Author: YellowRoseCx <[email protected]> Date: Wed Aug 2 01:53:58 2023 -0500 Update Makefile commit d91456a Author: ardfork <[email protected]> Date: Mon Jul 31 20:35:00 2023 +0300 fix half2 decomposition commit c1cb70d Author: Henri Vasserman <[email protected]> 
Date: Mon Jul 31 19:56:44 2023 +0300 new build arg LLAMA_CUDA_MMQ_Y commit c1664a0 Merge: 4336231 0728c5a Author: Henri Vasserman <[email protected]> Date: Mon Jul 31 19:32:27 2023 +0300 Merge 'origin/master' into hipblas commit 848558d Author: YellowRoseCx <[email protected]> Date: Sun Jul 30 20:02:52 2023 -0500 import vars logic fix commit b650b84 Author: YellowRoseCx <[email protected]> Date: Sun Jul 30 00:21:36 2023 -0500 Update easy_KCPP-ROCm_install.sh commit 8573a67 Author: YellowRoseCx <[email protected]> Date: Sat Jul 29 21:31:12 2023 -0500 remove duplicate code and fix typo remove duplicate tooltip commit 430986e Author: YellowRoseCx <[email protected]> Date: Sat Jul 29 21:07:34 2023 -0500 hide "missing" if all are built move tooltip functions to helper functions section. hides the string "Missing: ..." from showing if all backends are available " if len(runopts)==6 else + " commit dd0db72 Author: YellowRoseCx <[email protected]> Date: Sat Jul 29 20:52:31 2023 -0500 hide "missing" if all are built move tooltip functions to helper functions section. hides the string "Missing: ..." 
from showing if all backends are available commit 43fffb6 Merge: 0ed65a4 b40550c Author: YellowRoseCx <[email protected]> Date: Sat Jul 29 19:13:15 2023 -0500 Merge branch 'concedo' commit 0ed65a4 Author: YellowRoseCx <[email protected]> Date: Sat Jul 29 18:34:21 2023 -0500 Hide unavailable backends & Add tooltip over backend count Hides unavailable backends from the user and if the program is launched without any backends made, it shows an error message to them stating no backends were found and to make them using the 'make' command Add tooltip when hovering over backend count label hovering over the new label that shows the backend count will explain what the numbers are, and show the users which backends are not available or built commit 2a26398 Merge: cee2e9d 31486eb Author: YellowRoseCx <[email protected]> Date: Sat Jul 29 15:16:33 2023 -0500 Merge remote-tracking branch 'upstream/concedo' commit 4336231 Author: Henri Vasserman <[email protected]> Date: Sat Jul 29 18:35:56 2023 +0300 add hipBLAS to README --------- Co-authored-by: ardfork <[email protected]> commit f8e3fc6 Author: Henri Vasserman <[email protected]> Date: Sat Jul 29 14:16:46 2023 +0300 rocblas init stuff commit d2ade63 Merge: cde52d6 8a88e58 Author: Henri Vasserman <[email protected]> Date: Sat Jul 29 12:59:48 2023 +0300 Merge 'origin/master' into hipblas commit cee2e9d Author: YellowRoseCx <[email protected]> Date: Wed Jul 26 23:36:55 2023 -0500 Only Show Available Backends in GUI Hides unavailable backends from the user and if the program is launched without any backends made, it shows an error message to them stating no backends were found and to make them using the 'make' command commit 7863610 Author: YellowRoseCx <[email protected]> Date: Wed Jul 26 13:27:22 2023 -0500 Update easy_KCPP-ROCm_install.sh commit 731cd6e Author: YellowRoseCx <[email protected]> Date: Tue Jul 25 22:39:50 2023 -0500 Create easy_rocm_install.sh commit f154685 Merge: cbdc1f3 94e0a06 Author: YellowRoseCx <[email 
protected]> Date: Tue Jul 25 22:25:10 2023 -0500 Merge branch 'concedo_experimentalMAIN' commit cbdc1f3 Merge: 5b838d4 9731682 Author: YellowRoseCx <[email protected]> Date: Mon Jul 24 16:53:21 2023 -0500 Merge remote-tracking branch 'upstream/concedo' commit cde52d6 Merge: 8e8054a 84e09a7 Author: Henri Vasserman <[email protected]> Date: Mon Jul 24 12:22:58 2023 +0300 Merge 'origin/master' into hipblas commit 8e8054a Author: Henri Vasserman <[email protected]> Date: Mon Jul 24 12:20:49 2023 +0300 Add rocblas to build files commit 1f6294d Author: YellowRoseCx <[email protected]> Date: Mon Jul 24 03:52:01 2023 -0500 Fix multi GPU on multiple amd architectures with rocblas_initialize() (#5) * initialize rocblas commit 5b838d4 Author: YellowRoseCx <[email protected]> Date: Mon Jul 24 03:10:35 2023 -0500 amd multigpu full layer offload w/o vram scratch commit 9bfb2fd Merge: b379f9d 66328fc Author: YellowRoseCx <[email protected]> Date: Mon Jul 24 03:07:44 2023 -0500 Merge branch 'concedo_experimental' commit b379f9d Author: YellowRoseCx <[email protected]> Date: Mon Jul 24 03:07:00 2023 -0500 Revert "amd multigpu full layer offload w/o vram scratch" This reverts commit 9adfc8e. 
commit 9adfc8e Author: YellowRoseCx <[email protected]> Date: Mon Jul 24 02:56:40 2023 -0500 amd multigpu full layer offload w/o vram scratch commit 05c792e Author: YellowRoseCx <[email protected]> Date: Mon Jul 24 00:18:48 2023 -0500 initialize rocblas commit ade68d0 Merge: 521ad6b 56995ca Author: YellowRoseCx <[email protected]> Date: Sun Jul 23 20:25:05 2023 -0500 Merge remote-tracking branch 'upstream/concedo' commit 521ad6b Author: YellowRoseCx <[email protected]> Date: Thu Jul 20 21:42:33 2023 -0500 lazy import_var error handling for saves commit 9553e52 Merge: cac6650 f036109 Author: YellowRoseCx <[email protected]> Date: Thu Jul 20 19:59:41 2023 -0500 Merge remote-tracking branch 'upstream/concedo' commit cac6650 Author: YellowRoseCx <[email protected]> Date: Mon Jul 17 23:05:02 2023 -0500 Makefile fix! Allows hip/clblast build together commit 3db70b5 Merge: 2ec4466 7568d1a Author: Henri Vasserman <[email protected]> Date: Tue Jul 18 01:54:17 2023 +0300 Merge 'origin/master' into hipblas commit f208670 Author: YellowRoseCx <[email protected]> Date: Fri Jul 14 02:56:03 2023 -0500 improve error handling with gpu names commit 860e738 Author: YellowRoseCx <[email protected]> Date: Fri Jul 14 00:33:03 2023 -0500 Show GPU names in GUI, Only show GPUs that exist changed the pre-set 1,2,3 and 1,2,3,all settings that the GPU selector had and replaced them with a function that grabs the GPU names and sets the names as the values for the selector boxes. commit 2ec4466 Author: Henri Vasserman <[email protected]> Date: Thu Jul 13 13:44:02 2023 +0300 Update build flags. GGML_CUDA_DMMV_Y is now GGML_CUDA_MMV_Y so update your build instructions. GGML_CUDA_FORCE_DMMV is always enabled. 
--------- Co-authored-by: YellowRoseCx <[email protected]> commit cd36b18 Merge: afcb8fe 1cbf561 Author: Henri Vasserman <[email protected]> Date: Thu Jul 13 13:03:01 2023 +0300 Merge 'origin/master' into hipblas commit ac7ebc3 Author: YellowRoseCx <[email protected]> Date: Wed Jul 12 18:32:18 2023 -0500 add hipBLAS name scheme to GUI and update README commit 7f85cc5 Author: YellowRoseCx <[email protected]> Date: Wed Jul 12 17:35:54 2023 -0500 update makefile and ggml.c commit 6ca3499 Author: YellowRoseCx <[email protected]> Date: Wed Jul 12 15:43:45 2023 -0500 ggml.c fix commit 770e674 Merge: 2b289cd 5941514 Author: YellowRoseCx <[email protected]> Date: Wed Jul 12 15:24:36 2023 -0500 Merge remote-tracking branch 'upstream/concedo' commit 2b289cd Author: YellowRoseCx <[email protected]> Date: Wed Jul 12 14:30:00 2023 -0500 Update c-cpp.yml commit 5dae95a Author: YellowRoseCx <[email protected]> Date: Wed Jul 12 14:28:51 2023 -0500 Update c-cpp.yml commit b37cd73 Author: YellowRoseCx <[email protected]> Date: Wed Jul 12 14:27:04 2023 -0500 Create c-cpp.yml to test Actions commit afcb8fe Author: Henri Vasserman <[email protected]> Date: Tue Jul 11 18:09:27 2023 +0300 Add new config option commit 8c2c497 Merge: e610466 2347463 Author: Henri Vasserman <[email protected]> Date: Tue Jul 11 17:53:54 2023 +0300 Merge 'origin/master' into hipblas commit e610466 Author: Henri Vasserman <[email protected]> Date: Tue Jul 11 17:53:14 2023 +0300 Expand arch list and make it overrideable commit 80e4e54 Merge: 7735c5a 1d16309 Author: Henri Vasserman <[email protected]> Date: Mon Jul 10 02:09:28 2023 +0300 Merge 'origin/master' into hipblas commit 8432e9d Author: YellowRoseCx <[email protected]> Date: Sun Jul 9 16:55:30 2023 -0500 Update Makefile commit b58c189 Author: YellowRoseCx <[email protected]> Date: Sun Jul 9 16:20:00 2023 -0500 Add multi-gpu CuBLAS support to new GUI commit 0c1c71b Author: YellowRoseCx <[email protected]> Date: Sat Jul 8 07:56:57 2023 -0500 Update 
Makefile commit f864f60 Author: Johannes Gäßler <[email protected]> Date: Sat Jul 8 00:25:15 2023 +0200 CUDA: add __restrict__ to mul mat vec kernels (ggml-org#2140) commit 4539bc2 Author: YellowRoseCx <[email protected]> Date: Sat Jul 8 01:36:14 2023 -0500 update makefile for changes commit 912e31e Merge: 74e2703 ddaa4f2 Author: YellowRoseCx <[email protected]> Date: Fri Jul 7 23:15:37 2023 -0500 Merge remote-tracking branch 'upstream/concedo' commit 74e2703 Merge: cf65429 f9108ba Author: YellowRoseCx <[email protected]> Date: Wed Jul 5 15:16:49 2023 -0500 Merge branch 'LostRuins:concedo' into main commit 7735c5a Merge: c3e3733 7ee76e4 Author: Henri Vasserman <[email protected]> Date: Tue Jul 4 17:09:16 2023 +0300 Merge 'origin/master' into hipblas commit cf65429 Author: YellowRoseCx <[email protected]> Date: Mon Jul 3 16:56:40 2023 -0500 print cuda or opencl based on what's used commit 72c16d2 Author: YellowRoseCx <[email protected]> Date: Mon Jul 3 16:45:39 2023 -0500 Revert "fix my mistake that broke other arches" This reverts commit 777aed5. commit 777aed5 Author: YellowRoseCx <[email protected]> Date: Mon Jul 3 15:53:32 2023 -0500 fix my mistake that broke other arches commit 27780a9 Author: YellowRoseCx <[email protected]> Date: Sun Jul 2 16:03:27 2023 -0500 rocm fixes commit f52c7d4 Author: YellowRoseCx <[email protected]> Date: Sun Jul 2 16:02:58 2023 -0500 Revert "rocm fixes" This reverts commit 2fe9927. commit 2fe9927 Author: YellowRoseCx <[email protected]> Date: Sun Jul 2 15:58:21 2023 -0500 rocm fixes commit efe7560 Author: YellowRoseCx <[email protected]> Date: Sun Jul 2 15:55:43 2023 -0500 Revert "move HIPBLAS definitions into ggml-cuda.h" This reverts commit bf49a93. commit 4fc0181 Author: YellowRoseCx <[email protected]> Date: Sun Jul 2 15:55:36 2023 -0500 Revert "move hipblas definitions to header files" This reverts commit 2741ffb. 
commit 89eb576 Merge: 2741ffb 3d2907d Author: YellowRoseCx <[email protected]> Date: Sun Jul 2 14:44:13 2023 -0500 Merge branch 'LostRuins:concedo' into main commit c3e3733 Author: Henri Vasserman <[email protected]> Date: Sun Jul 2 15:51:31 2023 +0300 ROCm fixes commit 15db19a Merge: 04419f1 46088f7 Author: Henri Vasserman <[email protected]> Date: Sun Jul 2 15:39:57 2023 +0300 Merge 'origin/master' into hipblas commit 2741ffb Author: YellowRoseCx <[email protected]> Date: Sat Jul 1 17:07:42 2023 -0500 move hipblas definitions to header files commit bf49a93 Author: YellowRoseCx <[email protected]> Date: Sat Jul 1 16:38:50 2023 -0500 move HIPBLAS definitions into ggml-cuda.h commit 540f4e0 Merge: 2c3b46f eda663f Author: YellowRoseCx <[email protected]> Date: Sat Jul 1 14:58:32 2023 -0500 Merge remote-tracking branch 'upstream/concedo' commit 2c3b46f Author: YellowRoseCx <[email protected]> Date: Thu Jun 29 18:43:43 2023 -0500 changes to fix build commit c9e1103 Author: YellowRoseCx <[email protected]> Date: Thu Jun 29 18:20:07 2023 -0500 Update ggml_v2-cuda-legacy.cu for ROCM commit b858fc5 Author: YellowRoseCx <[email protected]> Date: Thu Jun 29 17:49:39 2023 -0500 changes to work with upstream commit 69a0c25 Merge: 096f0b0 1347d3a Author: YellowRoseCx <[email protected]> Date: Thu Jun 29 16:59:06 2023 -0500 Merge remote-tracking branch 'upstream/concedo' commit 04419f1 Merge: bb16eff d3494bb Author: Henri Vasserman <[email protected]> Date: Wed Jun 28 23:30:10 2023 +0300 Merge 'origin/master' into hipblas commit bb16eff Author: YellowRoseCx <[email protected]> Date: Wed Jun 28 15:27:10 2023 -0500 headers fix; add kquants_iter for hipblas and add gfx803 (#1) * kquants_iter for hipblas and add gfx803 * Update CMakeLists.txt with hipblas kquants_iter and DMMV_F16 * remove dmmv_f16 for now commit 096f0b0 Author: YellowRoseCx <[email protected]> Date: Wed Jun 28 15:27:02 2023 -0500 revert unnecessary hipblas conditionals commit d81e81a Author: YellowRoseCx <[email 
protected]> Date: Wed Jun 28 14:48:23 2023 -0500 Update Makefile hipblas nvcc correction commit c8ae945 Merge: c1e5c83 0be54f7 Author: Henri Vasserman <[email protected]> Date: Tue Jun 27 10:50:37 2023 +0300 Merge 'origin/master' into hipblas commit 2579ecf Merge: abed427 d2034ce Author: YellowRoseCx <[email protected]> Date: Sun Jun 25 17:50:04 2023 -0500 Merge branch 'LostRuins:concedo' into main commit c1e5c83 Merge: 35a6031 447ccbe Author: Henri Vasserman <[email protected]> Date: Sun Jun 25 21:40:05 2023 +0300 Merge 'origin/master' into hipblas commit 35a6031 Merge: df7346c 66a2555 Author: Henri Vasserman <[email protected]> Date: Sun Jun 25 10:57:48 2023 +0300 Merge 'origin/master' into hipblas commit abed427 Author: YellowRoseCx <[email protected]> Date: Sat Jun 24 19:16:30 2023 -0500 reorganize If statements to include proper headers commit 06c3bf0 Merge: ea6d320 8342fe8 Author: YellowRoseCx <[email protected]> Date: Sat Jun 24 16:57:20 2023 -0500 Merge branch 'LostRuins:concedo' into main commit ea6d320 Author: YellowRoseCx <[email protected]> Date: Fri Jun 23 01:53:28 2023 -0500 Update README.md commit 4d56ad8 Author: YellowRoseCx <[email protected]> Date: Thu Jun 22 16:19:43 2023 -0500 Update README.md commit 21f9308 Author: YellowRoseCx <[email protected]> Date: Thu Jun 22 15:42:05 2023 -0500 kquants_iter for hipblas and add gfx803 commit df7346c Merge: 5dd2fbe 7487137 Author: Henri Vasserman <[email protected]> Date: Thu Jun 22 20:51:09 2023 +0300 Merge 'origin/master' into hipblas commit b6ff890 Merge: eb094f0 e6ddb15 Author: YellowRoseCx <[email protected]> Date: Thu Jun 22 12:42:09 2023 -0500 Merge branch 'LostRuins:concedo' into main commit eb094f0 Author: YellowRoseCx <[email protected]> Date: Wed Jun 21 23:59:18 2023 -0500 lowvram parameter description commit 3a5dfeb Merge: 665cc11 b1f00fa Author: YellowRoseCx <[email protected]> Date: Wed Jun 21 16:53:03 2023 -0500 Merge branch 'LostRuins:concedo' into koboldcpp-rocm commit 665cc11 Author: 
YellowRoseCx <[email protected]> Date: Wed Jun 21 01:13:19 2023 -0500 add lowvram parameter commit 222cbbb Author: YellowRoseCx <[email protected]> Date: Tue Jun 20 19:03:28 2023 -0500 add additional hipblas conditions for cublas commit e1f9581 Author: YellowRoseCx <[email protected]> Date: Tue Jun 20 16:51:59 2023 -0500 Add hip def for cuda v2 commit 3bff5c0 Merge: a7e74b3 266d47a Author: YellowRoseCx <[email protected]> Date: Tue Jun 20 13:38:06 2023 -0500 Merge branch 'LostRuins:concedo' into koboldcpp-rocm commit a7e74b3 Author: YellowRoseCx <[email protected]> Date: Mon Jun 19 22:04:18 2023 -0500 Update README.md commit 5e99b3c Author: YellowRoseCx <[email protected]> Date: Mon Jun 19 22:03:42 2023 -0500 Update Makefile commit 9190b17 Author: YellowRoseCx <[email protected]> Date: Mon Jun 19 21:47:10 2023 -0500 Update README.md commit 5dd2fbe Merge: 67e229b 20568fe Author: Henri Vasserman <[email protected]> Date: Tue Jun 20 01:23:12 2023 +0300 Merge 'origin/master' into hipblas commit 2780ea2 Author: YellowRoseCx <[email protected]> Date: Sun Jun 18 15:48:00 2023 -0500 Update Makefile commit 04a3e64 Author: YellowRoseCx <[email protected]> Date: Sun Jun 18 14:33:39 2023 -0500 remove extra line commit cccbca9 Author: YellowRoseCx <[email protected]> Date: Sun Jun 18 14:31:17 2023 -0500 attempt adding ROCM hipblas commit a44a1d4 Author: YellowRoseCx <[email protected]> Date: Sun Jun 18 14:31:01 2023 -0500 attempt adding ROCM hipblas commit b088184 Author: YellowRoseCx <[email protected]> Date: Sun Jun 18 14:30:54 2023 -0500 attempt adding ROCM hipblas commit 67e229b Merge: 6f7c156 b241649 Author: Henri Vasserman <[email protected]> Date: Sun Jun 18 00:36:54 2023 +0300 Merge 'origin/master' into hipblas commit 6f7c156 Merge: 61df8e9 fc45a81 Author: Henri Vasserman <[email protected]> Date: Sat Jun 17 16:53:22 2023 +0300 Merge 'origin/master' into hipblas commit 61df8e9 Author: Henri Vasserman <[email protected]> Date: Wed Jun 14 22:46:10 2023 +0300 add 
cudaMemset commit a836529 Merge: 85f902d 254a7a7 Author: Henri Vasserman <[email protected]> Date: Wed Jun 14 22:41:55 2023 +0300 Merge 'origin/master' into hipblas commit 85f902d Merge: 4362e80 b50b570 Author: Henri Vasserman <[email protected]> Date: Thu Jun 8 10:50:28 2023 +0300 Merge 'origin/master' into hipblas commit 4362e80 Merge: fa5b3d7 17366df Author: Henri Vasserman <[email protected]> Date: Tue Jun 6 23:14:40 2023 +0300 Merge 'origin/master' into hipblas commit fa5b3d7 Author: Henri Vasserman <[email protected]> Date: Tue Jun 6 18:47:00 2023 +0300 fix makefile. commit 1ba4ce4 Author: Henri Vasserman <[email protected]> Date: Tue Jun 6 18:41:08 2023 +0300 Revert "warp size fixes" It seems like 32 is faster for me, at least and it won't cause so many conflicts. This reverts commit 5d6eb72. commit 5d6eb72 Author: Henri Vasserman <[email protected]> Date: Tue Jun 6 18:32:41 2023 +0300 warp size fixes commit 33091a9 Merge: 9fdaa1d 2d43387 Author: Henri Vasserman <[email protected]> Date: Tue Jun 6 16:19:23 2023 +0300 Merge 'origin/master' into hipblas commit 9fdaa1d Author: Henri Vasserman <[email protected]> Date: Sat May 27 19:17:53 2023 +0300 Add more defs For forward compatibility LostRuins#1607 commit a4648c1 Merge: 4c8b3fb 0ecb1bb Author: Henri Vasserman <[email protected]> Date: Sat May 27 18:22:39 2023 +0300 Merge 'origin/master' into hipblas commit 4c8b3fb Author: Henri Vasserman <[email protected]> Date: Fri May 26 01:08:53 2023 +0300 add configurable vars commit 30d921a Author: Henri Vasserman <[email protected]> Date: Fri May 26 01:03:56 2023 +0300 and makefile commit a593a4f Author: Henri Vasserman <[email protected]> Date: Fri May 26 00:55:28 2023 +0300 Add missing parameters commit 174bf6a Merge: f80ce7a 1fcdcc2 Author: Henri Vasserman <[email protected]> Date: Fri May 26 00:44:23 2023 +0300 Merge 'origin/master' into hipblas commit f80ce7a Merge: 600ace3 ac7876a Author: Henri Vasserman <[email protected]> Date: Thu May 25 00:02:50 2023 +0300 
Merge branch 'origin/master' into hipblas commit 600ace3 Author: Henri Vasserman <[email protected]> Date: Sat May 20 23:42:20 2023 +0300 update warp size commit b19fefe Author: Henri Vasserman <[email protected]> Date: Sat May 20 23:28:08 2023 +0300 Forwardcompat commit c66115b Merge: a0b2d5f b8ee340 Author: Henri Vasserman <[email protected]> Date: Sat May 20 18:29:31 2023 +0300 Merge 'origin/master' into hipblas commit a0b2d5f Merge: 8bab456 2a5ee02 Author: Henri Vasserman <[email protected]> Date: Tue May 16 17:08:29 2023 +0300 Merge 'origin/master' into hipblas commit 8bab456 Merge: 2956630 b5c9295 Author: Henri Vasserman <[email protected]> Date: Mon May 15 00:01:12 2023 +0300 Merge 'origin/master' into hipblas commit 2956630 Merge: 0fe6384 f048af0 Author: Henri Vasserman <[email protected]> Date: Sat May 13 13:12:52 2023 +0300 Merge 'origin/master' into hipblas commit 0fe6384 Author: Henri Vasserman <[email protected]> Date: Fri May 12 17:22:11 2023 +0300 fix makefile commit 605560d Merge: 127f68e 089b1c9 Author: Henri Vasserman <[email protected]> Date: Fri May 12 16:12:53 2023 +0300 Merge 'origin/master' into hipblas commit 127f68e Merge: 070cbcc b608b55 Author: Henri Vasserman <[email protected]> Date: Thu May 11 20:21:27 2023 +0300 Merge 'origin/master' into hipblas commit 070cbcc Author: Henri Vasserman <[email protected]> Date: Sun May 7 18:10:56 2023 +0300 occupanct function commit a3296d5 Merge: 0aefa6a e129551 Author: Henri Vasserman <[email protected]> Date: Sun May 7 18:06:04 2023 +0300 Merge 'origin/master' into hipblas commit 0aefa6a Merge: baeb482 1b0fd45 Author: Henri Vasserman <[email protected]> Date: Sun May 7 12:24:41 2023 +0300 Merge 'origin/master' into hipblas commit baeb482 Author: Henri Vasserman <[email protected]> Date: Sun May 7 12:24:12 2023 +0300 Revert to default copy commit 289073a Merge: 1107194 173d0e6 Author: Henri Vasserman <[email protected]> Date: Sat May 6 19:59:41 2023 +0300 Merge 'origin/master' into hipblas commit 
1107194 Merge: 04c0d48 a3b85b2 Author: Henri Vasserman <[email protected]> Date: Sat May 6 00:38:20 2023 +0300 Merge 'origin/master' into hipblas commit 04c0d48 Author: Henri Vasserman <[email protected]> Date: Thu May 4 12:31:16 2023 +0300 Move all HIP stuff to ggml-cuda.cu commit d83cfba Merge: b67cc50 799fdc1 Author: Henri Vasserman <[email protected]> Date: Thu May 4 11:31:16 2023 +0300 Merge 'origin/master' into hipblas commit b67cc50 Merge: fcbc262 e216aa0 Author: Henri Vasserman <[email protected]> Date: Wed May 3 15:04:51 2023 +0300 Merge 'origin/master' into hipblas commit fcbc262 Merge: c73def1 f4cef87 Author: Henri Vasserman <[email protected]> Date: Mon May 1 22:45:29 2023 +0300 Merge 'origin/master' into hipblas commit c73def1 Merge: d8ea75e f0d70f1 Author: Henri Vasserman <[email protected]> Date: Sun Apr 30 18:40:42 2023 +0300 Merge 'origin/master' into hipblas commit d8ea75e Merge: d194586 334637e Author: Henri Vasserman <[email protected]> Date: Sat Apr 29 11:25:51 2023 +0300 Merge 'origin/master' into hipblas commit d194586 Merge: 2ab9d11 7f15c5c Author: Henri Vasserman <[email protected]> Date: Fri Apr 28 23:03:52 2023 +0300 Merge 'origin/master' into hipblas commit 2ab9d11 Merge: 3b4a531 04aaae1 Author: Henri Vasserman <[email protected]> Date: Fri Apr 28 16:30:05 2023 +0300 Merge 'origin/master' into hipblas commit 3b4a531 Merge: a1caa48 0b2da20 Author: Henri Vasserman <[email protected]> Date: Fri Apr 28 10:08:41 2023 +0300 Merge 'origin/master' into hipblas commit a1caa48 Author: Henri Vasserman <[email protected]> Date: Fri Apr 28 10:08:21 2023 +0300 add more cuda defines This is so 'slaren/cuda-f16f32' would merge. 
commit ecc0565 Author: Henri Vasserman <[email protected]> Date: Fri Apr 28 01:58:27 2023 +0300 only .cu file needs to be complied as device commit ef51e9e Merge: d571d16 4afcc37 Author: Henri Vasserman <[email protected]> Date: Wed Apr 26 12:46:26 2023 +0300 Merge branch 'ggerganov:master' into hipblas commit d571d16 Merge: 608aa33 dd0eabc Author: Henri Vasserman <[email protected]> Date: Tue Apr 25 21:15:33 2023 +0300 Merge 'origin/master' into hipblas commit 608aa33 Author: Henri Vasserman <[email protected]> Date: Tue Apr 25 21:15:04 2023 +0300 change default GPU arch to match CMake commit 3a004b2 Author: Henri Vasserman <[email protected]> Date: Mon Apr 24 02:24:54 2023 +0300 add rpath commit db7a012 Merge: 3677235 284685f Author: Henri Vasserman <[email protected]> Date: Sun Apr 23 21:49:28 2023 +0300 Merge 'origin/master' into hipblas commit 3677235 Author: Henri Vasserman <[email protected]> Date: Sat Apr 22 23:28:00 2023 +0300 More build file changes commit d3e1984 Author: Henri Vasserman <[email protected]> Date: Fri Apr 21 03:32:06 2023 +0300 add rpath commit 0e005f7 Author: Henri Vasserman <[email protected]> Date: Fri Apr 21 02:13:00 2023 +0300 Build file changes Now HIP Clang is not required, the CMake scripts will configure the needed compiler, which can be system clang++. Also other code can still use GCC, but CMake will force the clang to link. commit 54a63c1 Author: Henri Vasserman <[email protected]> Date: Thu Apr 20 22:19:22 2023 +0300 Update Makefile for the Cuda kernels commit 0fd8363 Author: Henri Vasserman <[email protected]> Date: Thu Apr 20 02:04:00 2023 +0300 use hipblas based on cublas

* use hipblas based on cublas
* Update Makefile for the Cuda kernels
* Expand arch list and make it overrideable
* Fix multi GPU on multiple amd architectures with rocblas_initialize() (#5)
* add hipBLAS to README
* new build arg LLAMA_CUDA_MMQ_Y
* fix half2 decomposition
* Add intrinsics polyfills for AMD
* AMD assembly optimized __dp4a
* Allow overriding CC_TURING
* use "ROCm" instead of "CUDA"
* ignore all build dirs
* Add Dockerfiles
* fix llama-bench
* fix -nommq help for non CUDA/HIP

---------

Co-authored-by: YellowRoseCx <[email protected]>
Co-authored-by: ardfork <[email protected]>
Co-authored-by: funnbot <[email protected]>
Co-authored-by: Engininja2 <[email protected]>
Co-authored-by: Kerfuffle <[email protected]>
Co-authored-by: jammm <[email protected]>
Co-authored-by: jdecourval <[email protected]>

* koboldcpp-ROCm Port commit 3416c98 Merge: 5eb17f0 4c4e435 Author: YellowRoseCx <[email protected]> Date: Fri Aug 25 13:46:56 2023 -0500 Merge remote-tracking branch 'upstream/concedo' commit 5eb17f0 Author: YellowRoseCx <[email protected]> Date: Fri Aug 25 13:38:21 2023 -0500 ROCm Port update * use hipblas based on cublas * Update Makefile for the Cuda kernels * Expand arch list and make it overrideable * Fix multi GPU on multiple amd architectures with rocblas_initialize() (#5) * add hipBLAS to README * new build arg LLAMA_CUDA_MMQ_Y * fix half2 decomposition * Add intrinsics polyfills for AMD * AMD assembly optimized __dp4a * Allow overriding CC_TURING * use "ROCm" instead of "CUDA" * ignore all build dirs * Add Dockerfiles * fix llama-bench * fix -nommq help for non CUDA/HIP --------- Co-Authored-By: YellowRoseCx <[email protected]> Co-Authored-By: ardfork <[email protected]> Co-Authored-By: funnbot <[email protected]> Co-Authored-By: Engininja2 <[email protected]> Co-Authored-By: Kerfuffle <[email protected]> Co-Authored-By: jammm <[email protected]> Co-Authored-By: jdecourval <[email protected]> commit b34f4bd Author: YellowRoseCx <[email protected]> Date: Sat Aug 19 17:12:52 2023 -0500 Update README.md commit 7d11961 Author: YellowRoseCx <[email protected]> Date: Mon Aug 14 23:03:12 2023 -0500 remove force DMMV commit cd61aa0 Author: YellowRoseCx <[email protected]> Date: Sat Aug 12 17:24:31 2023 -0500 restore main_gpu parameter commit 4a042f3 Author: Henri Vasserman <[email protected]> Date: Sat Aug 12 10:51:46 2023 +0300 gfx1100 support --------- Co-authored-by: ardfork <[email protected]> Co-authored-by: jammm <[email protected]> Co-authored-by: jdecourval <[email protected]> commit 8913bc6 Author: Henri Vasserman <[email protected]> Date: Fri Aug 11 10:16:02 2023 +0300 Allow overriding CC_TURING commit e77a4c3 Author: Henri Vasserman <[email protected]> Date: Fri Aug 11 10:00:07 2023 +0300 Merge 'origin/master' into hipblas commit cc4c4e3 Author: 
Engininja2 <[email protected]> Date: Fri Aug 11 09:43:14 2023 +0300 New __dp4a assembly Now compatible with gfx900 and faster as well. commit 1a03b70 Author: Henri Vasserman <[email protected]> Date: Fri Aug 11 09:30:28 2023 +0300 Undo mess --------- Co-authored-by: ardfork <[email protected]> commit 4366ff9 Author: DannyDaemonic <[email protected]> Date: Thu Aug 10 13:11:36 2023 -0700 Handle `ENABLE_VIRTUAL_TERMINAL_PROCESSING` more gracefully on earlier versions of Windows. commit 811ff85 Author: Christian Demsar <[email protected]> Date: Thu Aug 10 10:28:27 2023 -0400 Add --n-predict -2 for stopping generation on full context (ggml-org#2565) commit 37c9717 Author: Martin Krasser <[email protected]> Date: Thu Aug 10 12:16:38 2023 +0200 Fix grammar-based sampling issue in server (ggml-org#2566) commit d18ecd5 Author: YellowRoseCx <[email protected]> Date: Thu Aug 10 13:19:41 2023 -0500 make mmq gen faster for amd commit 243894a Author: Henri Vasserman <[email protected]> Date: Thu Aug 10 12:14:40 2023 +0300 ws fix commit ac2f14d Author: Engininja2 <[email protected]> Date: Thu Aug 10 12:11:27 2023 +0300 AMD assembly optimized __dp4a Doesn't seem to work for gfx900, so commented out. commit 9dba0c9 Author: Henri Vasserman <[email protected]> Date: Thu Aug 10 12:09:28 2023 +0300 Fix merge --------- Co-authored-by: ardfork <[email protected]> Co-authored-by: Kerfuffle <[email protected]> commit f570b5c Author: YellowRoseCx <[email protected]> Date: Wed Aug 9 22:11:20 2023 -0500 Revert "revert cuda changes as they are bugggy" This reverts commit 1541bf8. 
commit 1541bf8 Author: Concedo <[email protected]> Date: Wed Aug 9 22:36:41 2023 +0800 revert cuda changes as they are bugggy commit bacc202 Author: YellowRoseCx <[email protected]> Date: Wed Aug 9 20:37:17 2023 -0500 Merge remote-tracking branch 'upstream/concedo' commit b7cb4cf Author: YellowRoseCx <[email protected]> Date: Wed Aug 9 20:00:52 2023 -0500 additional fixes commit fadae72 Merge: 518eb2a 8f8ab6c Author: YellowRoseCx <[email protected]> Date: Wed Aug 9 18:45:50 2023 -0500 Merge branch 'hipblas' into develop4Main commit 518eb2a Merge: bda0215 cae6a84 Author: YellowRoseCx <[email protected]> Date: Wed Aug 9 18:32:10 2023 -0500 Merge remote-tracking branch 'upstream/concedo' into develop2Main commit bda0215 Author: YellowRoseCx <[email protected]> Date: Wed Aug 9 18:17:54 2023 -0500 update makefile to multisystem path commit 8f8ab6c Author: YellowRoseCx <[email protected]> Date: Wed Aug 9 18:05:03 2023 -0500 hipLDFLAG Path change Unix to multisystem in Makefile changed the hardcoded linux distro hipblas LD path from -L/opt/rocm/lib to use the defined ROCM_PATH variable to be flexible with ROCm on non-Linux OS commit 610ba4c Merge: 4024f91 25d43e0 Author: Henri Vasserman <[email protected]> Date: Wed Aug 9 23:54:58 2023 +0300 Merge 'origin/master' into hipblas commit 4024f91 Author: Henri Vasserman <[email protected]> Date: Wed Aug 9 01:56:44 2023 +0300 Add intrinsics polyfills for AMD --------- Co-authored-by: ardfork <[email protected]> Co-authored-by: funnbot <[email protected]> Co-authored-by: Engininja2 <[email protected]> commit ab62128 Merge: d91456a f5bfea0 Author: Henri Vasserman <[email protected]> Date: Wed Aug 9 00:37:01 2023 +0300 Merge 'origin/master' into hipblas commit ee9fa2a Author: YellowRoseCx <[email protected]> Date: Wed Aug 2 01:53:58 2023 -0500 Update Makefile commit d91456a Author: ardfork <[email protected]> Date: Mon Jul 31 20:35:00 2023 +0300 fix half2 decomposition commit c1cb70d Author: Henri Vasserman <[email protected]> 
Date: Mon Jul 31 19:56:44 2023 +0300 new build arg LLAMA_CUDA_MMQ_Y commit c1664a0 Merge: 4336231 0728c5a Author: Henri Vasserman <[email protected]> Date: Mon Jul 31 19:32:27 2023 +0300 Merge 'origin/master' into hipblas commit 848558d Author: YellowRoseCx <[email protected]> Date: Sun Jul 30 20:02:52 2023 -0500 import vars logic fix commit b650b84 Author: YellowRoseCx <[email protected]> Date: Sun Jul 30 00:21:36 2023 -0500 Update easy_KCPP-ROCm_install.sh commit 8573a67 Author: YellowRoseCx <[email protected]> Date: Sat Jul 29 21:31:12 2023 -0500 remove duplicate code and fix typo remove duplicate tooltip commit 430986e Author: YellowRoseCx <[email protected]> Date: Sat Jul 29 21:07:34 2023 -0500 hide "missing" if all are built move tooltip functions to helper functions section. hides the string "Missing: ..." from showing if all backends are available " if len(runopts)==6 else + " commit dd0db72 Author: YellowRoseCx <[email protected]> Date: Sat Jul 29 20:52:31 2023 -0500 hide "missing" if all are built move tooltip functions to helper functions section. hides the string "Missing: ..." 
from showing if all backends are available commit 43fffb6 Merge: 0ed65a4 b40550c Author: YellowRoseCx <[email protected]> Date: Sat Jul 29 19:13:15 2023 -0500 Merge branch 'concedo' commit 0ed65a4 Author: YellowRoseCx <[email protected]> Date: Sat Jul 29 18:34:21 2023 -0500 Hide unavailable backends & Add tooltip over backend count Hides unavailable backends from the user and if the program is launched without any backends made, it shows an error message to them stating no backends were found and to make them using the 'make' command Add tooltip when hovering over backend count label hovering over the new label that shows the backend count will explain what the numbers are, and show the users which backends are not available or built commit 2a26398 Merge: cee2e9d 31486eb Author: YellowRoseCx <[email protected]> Date: Sat Jul 29 15:16:33 2023 -0500 Merge remote-tracking branch 'upstream/concedo' commit 4336231 Author: Henri Vasserman <[email protected]> Date: Sat Jul 29 18:35:56 2023 +0300 add hipBLAS to README --------- Co-authored-by: ardfork <[email protected]> commit f8e3fc6 Author: Henri Vasserman <[email protected]> Date: Sat Jul 29 14:16:46 2023 +0300 rocblas init stuff commit d2ade63 Merge: cde52d6 8a88e58 Author: Henri Vasserman <[email protected]> Date: Sat Jul 29 12:59:48 2023 +0300 Merge 'origin/master' into hipblas commit cee2e9d Author: YellowRoseCx <[email protected]> Date: Wed Jul 26 23:36:55 2023 -0500 Only Show Available Backends in GUI Hides unavailable backends from the user and if the program is launched without any backends made, it shows an error message to them stating no backends were found and to make them using the 'make' command commit 7863610 Author: YellowRoseCx <[email protected]> Date: Wed Jul 26 13:27:22 2023 -0500 Update easy_KCPP-ROCm_install.sh commit 731cd6e Author: YellowRoseCx <[email protected]> Date: Tue Jul 25 22:39:50 2023 -0500 Create easy_rocm_install.sh commit f154685 Merge: cbdc1f3 94e0a06 Author: YellowRoseCx <[email 
protected]> Date: Tue Jul 25 22:25:10 2023 -0500 Merge branch 'concedo_experimentalMAIN' commit cbdc1f3 Merge: 5b838d4 9731682 Author: YellowRoseCx <[email protected]> Date: Mon Jul 24 16:53:21 2023 -0500 Merge remote-tracking branch 'upstream/concedo' commit cde52d6 Merge: 8e8054a 84e09a7 Author: Henri Vasserman <[email protected]> Date: Mon Jul 24 12:22:58 2023 +0300 Merge 'origin/master' into hipblas commit 8e8054a Author: Henri Vasserman <[email protected]> Date: Mon Jul 24 12:20:49 2023 +0300 Add rocblas to build files commit 1f6294d Author: YellowRoseCx <[email protected]> Date: Mon Jul 24 03:52:01 2023 -0500 Fix multi GPU on multiple amd architectures with rocblas_initialize() (#5) * initialize rocblas commit 5b838d4 Author: YellowRoseCx <[email protected]> Date: Mon Jul 24 03:10:35 2023 -0500 amd multigpu full layer offload w/o vram scratch commit 9bfb2fd Merge: b379f9d 66328fc Author: YellowRoseCx <[email protected]> Date: Mon Jul 24 03:07:44 2023 -0500 Merge branch 'concedo_experimental' commit b379f9d Author: YellowRoseCx <[email protected]> Date: Mon Jul 24 03:07:00 2023 -0500 Revert "amd multigpu full layer offload w/o vram scratch" This reverts commit 9adfc8e. 
commit 9adfc8e Author: YellowRoseCx <[email protected]> Date: Mon Jul 24 02:56:40 2023 -0500 amd multigpu full layer offload w/o vram scratch commit 05c792e Author: YellowRoseCx <[email protected]> Date: Mon Jul 24 00:18:48 2023 -0500 initialize rocblas commit ade68d0 Merge: 521ad6b 56995ca Author: YellowRoseCx <[email protected]> Date: Sun Jul 23 20:25:05 2023 -0500 Merge remote-tracking branch 'upstream/concedo' commit 521ad6b Author: YellowRoseCx <[email protected]> Date: Thu Jul 20 21:42:33 2023 -0500 lazy import_var error handling for saves commit 9553e52 Merge: cac6650 f036109 Author: YellowRoseCx <[email protected]> Date: Thu Jul 20 19:59:41 2023 -0500 Merge remote-tracking branch 'upstream/concedo' commit cac6650 Author: YellowRoseCx <[email protected]> Date: Mon Jul 17 23:05:02 2023 -0500 Makefile fix! Allows hip/clblast build together commit 3db70b5 Merge: 2ec4466 7568d1a Author: Henri Vasserman <[email protected]> Date: Tue Jul 18 01:54:17 2023 +0300 Merge 'origin/master' into hipblas commit f208670 Author: YellowRoseCx <[email protected]> Date: Fri Jul 14 02:56:03 2023 -0500 improve error handling with gpu names commit 860e738 Author: YellowRoseCx <[email protected]> Date: Fri Jul 14 00:33:03 2023 -0500 Show GPU names in GUI, Only show GPUs that exist changed the pre-set 1,2,3 and 1,2,3,all settings that the GPU selector had and replaced them with a function that grabs the GPU names and sets the names as the values for the selector boxes. commit 2ec4466 Author: Henri Vasserman <[email protected]> Date: Thu Jul 13 13:44:02 2023 +0300 Update build flags. GGML_CUDA_DMMV_Y is now GGML_CUDA_MMV_Y so update your build instructions. GGML_CUDA_FORCE_DMMV is always enabled. 
--------- Co-authored-by: YellowRoseCx <[email protected]> commit cd36b18 Merge: afcb8fe 1cbf561 Author: Henri Vasserman <[email protected]> Date: Thu Jul 13 13:03:01 2023 +0300 Merge 'origin/master' into hipblas commit ac7ebc3 Author: YellowRoseCx <[email protected]> Date: Wed Jul 12 18:32:18 2023 -0500 add hipBLAS name scheme to GUI and update README commit 7f85cc5 Author: YellowRoseCx <[email protected]> Date: Wed Jul 12 17:35:54 2023 -0500 update makefile and ggml.c commit 6ca3499 Author: YellowRoseCx <[email protected]> Date: Wed Jul 12 15:43:45 2023 -0500 ggml.c fix commit 770e674 Merge: 2b289cd 5941514 Author: YellowRoseCx <[email protected]> Date: Wed Jul 12 15:24:36 2023 -0500 Merge remote-tracking branch 'upstream/concedo' commit 2b289cd Author: YellowRoseCx <[email protected]> Date: Wed Jul 12 14:30:00 2023 -0500 Update c-cpp.yml commit 5dae95a Author: YellowRoseCx <[email protected]> Date: Wed Jul 12 14:28:51 2023 -0500 Update c-cpp.yml commit b37cd73 Author: YellowRoseCx <[email protected]> Date: Wed Jul 12 14:27:04 2023 -0500 Create c-cpp.yml to test Actions commit afcb8fe Author: Henri Vasserman <[email protected]> Date: Tue Jul 11 18:09:27 2023 +0300 Add new config option commit 8c2c497 Merge: e610466 2347463 Author: Henri Vasserman <[email protected]> Date: Tue Jul 11 17:53:54 2023 +0300 Merge 'origin/master' into hipblas commit e610466 Author: Henri Vasserman <[email protected]> Date: Tue Jul 11 17:53:14 2023 +0300 Expand arch list and make it overrideable commit 80e4e54 Merge: 7735c5a 1d16309 Author: Henri Vasserman <[email protected]> Date: Mon Jul 10 02:09:28 2023 +0300 Merge 'origin/master' into hipblas commit 8432e9d Author: YellowRoseCx <[email protected]> Date: Sun Jul 9 16:55:30 2023 -0500 Update Makefile commit b58c189 Author: YellowRoseCx <[email protected]> Date: Sun Jul 9 16:20:00 2023 -0500 Add multi-gpu CuBLAS support to new GUI commit 0c1c71b Author: YellowRoseCx <[email protected]> Date: Sat Jul 8 07:56:57 2023 -0500 Update 
Makefile commit f864f60 Author: Johannes Gäßler <[email protected]> Date: Sat Jul 8 00:25:15 2023 +0200 CUDA: add __restrict__ to mul mat vec kernels (ggml-org#2140) commit 4539bc2 Author: YellowRoseCx <[email protected]> Date: Sat Jul 8 01:36:14 2023 -0500 update makefile for changes commit 912e31e Merge: 74e2703 ddaa4f2 Author: YellowRoseCx <[email protected]> Date: Fri Jul 7 23:15:37 2023 -0500 Merge remote-tracking branch 'upstream/concedo' commit 74e2703 Merge: cf65429 f9108ba Author: YellowRoseCx <[email protected]> Date: Wed Jul 5 15:16:49 2023 -0500 Merge branch 'LostRuins:concedo' into main commit 7735c5a Merge: c3e3733 7ee76e4 Author: Henri Vasserman <[email protected]> Date: Tue Jul 4 17:09:16 2023 +0300 Merge 'origin/master' into hipblas commit cf65429 Author: YellowRoseCx <[email protected]> Date: Mon Jul 3 16:56:40 2023 -0500 print cuda or opencl based on what's used commit 72c16d2 Author: YellowRoseCx <[email protected]> Date: Mon Jul 3 16:45:39 2023 -0500 Revert "fix my mistake that broke other arches" This reverts commit 777aed5. commit 777aed5 Author: YellowRoseCx <[email protected]> Date: Mon Jul 3 15:53:32 2023 -0500 fix my mistake that broke other arches commit 27780a9 Author: YellowRoseCx <[email protected]> Date: Sun Jul 2 16:03:27 2023 -0500 rocm fixes commit f52c7d4 Author: YellowRoseCx <[email protected]> Date: Sun Jul 2 16:02:58 2023 -0500 Revert "rocm fixes" This reverts commit 2fe9927. commit 2fe9927 Author: YellowRoseCx <[email protected]> Date: Sun Jul 2 15:58:21 2023 -0500 rocm fixes commit efe7560 Author: YellowRoseCx <[email protected]> Date: Sun Jul 2 15:55:43 2023 -0500 Revert "move HIPBLAS definitions into ggml-cuda.h" This reverts commit bf49a93. commit 4fc0181 Author: YellowRoseCx <[email protected]> Date: Sun Jul 2 15:55:36 2023 -0500 Revert "move hipblas definitions to header files" This reverts commit 2741ffb. 
commit 89eb576 Merge: 2741ffb 3d2907d Author: YellowRoseCx <[email protected]> Date: Sun Jul 2 14:44:13 2023 -0500 Merge branch 'LostRuins:concedo' into main commit c3e3733 Author: Henri Vasserman <[email protected]> Date: Sun Jul 2 15:51:31 2023 +0300 ROCm fixes commit 15db19a Merge: 04419f1 46088f7 Author: Henri Vasserman <[email protected]> Date: Sun Jul 2 15:39:57 2023 +0300 Merge 'origin/master' into hipblas commit 2741ffb Author: YellowRoseCx <[email protected]> Date: Sat Jul 1 17:07:42 2023 -0500 move hipblas definitions to header files commit bf49a93 Author: YellowRoseCx <[email protected]> Date: Sat Jul 1 16:38:50 2023 -0500 move HIPBLAS definitions into ggml-cuda.h commit 540f4e0 Merge: 2c3b46f eda663f Author: YellowRoseCx <[email protected]> Date: Sat Jul 1 14:58:32 2023 -0500 Merge remote-tracking branch 'upstream/concedo' commit 2c3b46f Author: YellowRoseCx <[email protected]> Date: Thu Jun 29 18:43:43 2023 -0500 changes to fix build commit c9e1103 Author: YellowRoseCx <[email protected]> Date: Thu Jun 29 18:20:07 2023 -0500 Update ggml_v2-cuda-legacy.cu for ROCM commit b858fc5 Author: YellowRoseCx <[email protected]> Date: Thu Jun 29 17:49:39 2023 -0500 changes to work with upstream commit 69a0c25 Merge: 096f0b0 1347d3a Author: YellowRoseCx <[email protected]> Date: Thu Jun 29 16:59:06 2023 -0500 Merge remote-tracking branch 'upstream/concedo' commit 04419f1 Merge: bb16eff d3494bb Author: Henri Vasserman <[email protected]> Date: Wed Jun 28 23:30:10 2023 +0300 Merge 'origin/master' into hipblas commit bb16eff Author: YellowRoseCx <[email protected]> Date: Wed Jun 28 15:27:10 2023 -0500 headers fix; add kquants_iter for hipblas and add gfx803 (#1) * kquants_iter for hipblas and add gfx803 * Update CMakeLists.txt with hipblas kquants_iter and DMMV_F16 * remove dmmv_f16 for now commit 096f0b0 Author: YellowRoseCx <[email protected]> Date: Wed Jun 28 15:27:02 2023 -0500 revert unnecessary hipblas conditionals commit d81e81a Author: YellowRoseCx <[email 
protected]> Date: Wed Jun 28 14:48:23 2023 -0500 Update Makefile hipblas nvcc correction commit c8ae945 Merge: c1e5c83 0be54f7 Author: Henri Vasserman <[email protected]> Date: Tue Jun 27 10:50:37 2023 +0300 Merge 'origin/master' into hipblas commit 2579ecf Merge: abed427 d2034ce Author: YellowRoseCx <[email protected]> Date: Sun Jun 25 17:50:04 2023 -0500 Merge branch 'LostRuins:concedo' into main commit c1e5c83 Merge: 35a6031 447ccbe Author: Henri Vasserman <[email protected]> Date: Sun Jun 25 21:40:05 2023 +0300 Merge 'origin/master' into hipblas commit 35a6031 Merge: df7346c 66a2555 Author: Henri Vasserman <[email protected]> Date: Sun Jun 25 10:57:48 2023 +0300 Merge 'origin/master' into hipblas commit abed427 Author: YellowRoseCx <[email protected]> Date: Sat Jun 24 19:16:30 2023 -0500 reorganize If statements to include proper headers commit 06c3bf0 Merge: ea6d320 8342fe8 Author: YellowRoseCx <[email protected]> Date: Sat Jun 24 16:57:20 2023 -0500 Merge branch 'LostRuins:concedo' into main commit ea6d320 Author: YellowRoseCx <[email protected]> Date: Fri Jun 23 01:53:28 2023 -0500 Update README.md commit 4d56ad8 Author: YellowRoseCx <[email protected]> Date: Thu Jun 22 16:19:43 2023 -0500 Update README.md commit 21f9308 Author: YellowRoseCx <[email protected]> Date: Thu Jun 22 15:42:05 2023 -0500 kquants_iter for hipblas and add gfx803 commit df7346c Merge: 5dd2fbe 7487137 Author: Henri Vasserman <[email protected]> Date: Thu Jun 22 20:51:09 2023 +0300 Merge 'origin/master' into hipblas commit b6ff890 Merge: eb094f0 e6ddb15 Author: YellowRoseCx <[email protected]> Date: Thu Jun 22 12:42:09 2023 -0500 Merge branch 'LostRuins:concedo' into main commit eb094f0 Author: YellowRoseCx <[email protected]> Date: Wed Jun 21 23:59:18 2023 -0500 lowvram parameter description commit 3a5dfeb Merge: 665cc11 b1f00fa Author: YellowRoseCx <[email protected]> Date: Wed Jun 21 16:53:03 2023 -0500 Merge branch 'LostRuins:concedo' into koboldcpp-rocm commit 665cc11 Author: 
YellowRoseCx <[email protected]> Date: Wed Jun 21 01:13:19 2023 -0500 add lowvram parameter commit 222cbbb Author: YellowRoseCx <[email protected]> Date: Tue Jun 20 19:03:28 2023 -0500 add additional hipblas conditions for cublas commit e1f9581 Author: YellowRoseCx <[email protected]> Date: Tue Jun 20 16:51:59 2023 -0500 Add hip def for cuda v2 commit 3bff5c0 Merge: a7e74b3 266d47a Author: YellowRoseCx <[email protected]> Date: Tue Jun 20 13:38:06 2023 -0500 Merge branch 'LostRuins:concedo' into koboldcpp-rocm commit a7e74b3 Author: YellowRoseCx <[email protected]> Date: Mon Jun 19 22:04:18 2023 -0500 Update README.md commit 5e99b3c Author: YellowRoseCx <[email protected]> Date: Mon Jun 19 22:03:42 2023 -0500 Update Makefile commit 9190b17 Author: YellowRoseCx <[email protected]> Date: Mon Jun 19 21:47:10 2023 -0500 Update README.md commit 5dd2fbe Merge: 67e229b 20568fe Author: Henri Vasserman <[email protected]> Date: Tue Jun 20 01:23:12 2023 +0300 Merge 'origin/master' into hipblas commit 2780ea2 Author: YellowRoseCx <[email protected]> Date: Sun Jun 18 15:48:00 2023 -0500 Update Makefile commit 04a3e64 Author: YellowRoseCx <[email protected]> Date: Sun Jun 18 14:33:39 2023 -0500 remove extra line commit cccbca9 Author: YellowRoseCx <[email protected]> Date: Sun Jun 18 14:31:17 2023 -0500 attempt adding ROCM hipblas commit a44a1d4 Author: YellowRoseCx <[email protected]> Date: Sun Jun 18 14:31:01 2023 -0500 attempt adding ROCM hipblas commit b088184 Author: YellowRoseCx <[email protected]> Date: Sun Jun 18 14:30:54 2023 -0500 attempt adding ROCM hipblas commit 67e229b Merge: 6f7c156 b241649 Author: Henri Vasserman <[email protected]> Date: Sun Jun 18 00:36:54 2023 +0300 Merge 'origin/master' into hipblas commit 6f7c156 Merge: 61df8e9 fc45a81 Author: Henri Vasserman <[email protected]> Date: Sat Jun 17 16:53:22 2023 +0300 Merge 'origin/master' into hipblas commit 61df8e9 Author: Henri Vasserman <[email protected]> Date: Wed Jun 14 22:46:10 2023 +0300 add 
cudaMemset commit a836529 Merge: 85f902d 254a7a7 Author: Henri Vasserman <[email protected]> Date: Wed Jun 14 22:41:55 2023 +0300 Merge 'origin/master' into hipblas commit 85f902d Merge: 4362e80 b50b570 Author: Henri Vasserman <[email protected]> Date: Thu Jun 8 10:50:28 2023 +0300 Merge 'origin/master' into hipblas commit 4362e80 Merge: fa5b3d7 17366df Author: Henri Vasserman <[email protected]> Date: Tue Jun 6 23:14:40 2023 +0300 Merge 'origin/master' into hipblas commit fa5b3d7 Author: Henri Vasserman <[email protected]> Date: Tue Jun 6 18:47:00 2023 +0300 fix makefile. commit 1ba4ce4 Author: Henri Vasserman <[email protected]> Date: Tue Jun 6 18:41:08 2023 +0300 Revert "warp size fixes" It seems like 32 is faster for me, at least and it won't cause so many conflicts. This reverts commit 5d6eb72. commit 5d6eb72 Author: Henri Vasserman <[email protected]> Date: Tue Jun 6 18:32:41 2023 +0300 warp size fixes commit 33091a9 Merge: 9fdaa1d 2d43387 Author: Henri Vasserman <[email protected]> Date: Tue Jun 6 16:19:23 2023 +0300 Merge 'origin/master' into hipblas commit 9fdaa1d Author: Henri Vasserman <[email protected]> Date: Sat May 27 19:17:53 2023 +0300 Add more defs For forward compatibility LostRuins#1607 commit a4648c1 Merge: 4c8b3fb 0ecb1bb Author: Henri Vasserman <[email protected]> Date: Sat May 27 18:22:39 2023 +0300 Merge 'origin/master' into hipblas commit 4c8b3fb Author: Henri Vasserman <[email protected]> Date: Fri May 26 01:08:53 2023 +0300 add configurable vars commit 30d921a Author: Henri Vasserman <[email protected]> Date: Fri May 26 01:03:56 2023 +0300 and makefile commit a593a4f Author: Henri Vasserman <[email protected]> Date: Fri May 26 00:55:28 2023 +0300 Add missing parameters commit 174bf6a Merge: f80ce7a 1fcdcc2 Author: Henri Vasserman <[email protected]> Date: Fri May 26 00:44:23 2023 +0300 Merge 'origin/master' into hipblas commit f80ce7a Merge: 600ace3 ac7876a Author: Henri Vasserman <[email protected]> Date: Thu May 25 00:02:50 2023 +0300 
Merge branch 'origin/master' into hipblas commit 600ace3 Author: Henri Vasserman <[email protected]> Date: Sat May 20 23:42:20 2023 +0300 update warp size commit b19fefe Author: Henri Vasserman <[email protected]> Date: Sat May 20 23:28:08 2023 +0300 Forwardcompat commit c66115b Merge: a0b2d5f b8ee340 Author: Henri Vasserman <[email protected]> Date: Sat May 20 18:29:31 2023 +0300 Merge 'origin/master' into hipblas commit a0b2d5f Merge: 8bab456 2a5ee02 Author: Henri Vasserman <[email protected]> Date: Tue May 16 17:08:29 2023 +0300 Merge 'origin/master' into hipblas commit 8bab456 Merge: 2956630 b5c9295 Author: Henri Vasserman <[email protected]> Date: Mon May 15 00:01:12 2023 +0300 Merge 'origin/master' into hipblas commit 2956630 Merge: 0fe6384 f048af0 Author: Henri Vasserman <[email protected]> Date: Sat May 13 13:12:52 2023 +0300 Merge 'origin/master' into hipblas commit 0fe6384 Author: Henri Vasserman <[email protected]> Date: Fri May 12 17:22:11 2023 +0300 fix makefile commit 605560d Merge: 127f68e 089b1c9 Author: Henri Vasserman <[email protected]> Date: Fri May 12 16:12:53 2023 +0300 Merge 'origin/master' into hipblas commit 127f68e Merge: 070cbcc b608b55 Author: Henri Vasserman <[email protected]> Date: Thu May 11 20:21:27 2023 +0300 Merge 'origin/master' into hipblas commit 070cbcc Author: Henri Vasserman <[email protected]> Date: Sun May 7 18:10:56 2023 +0300 occupanct function commit a3296d5 Merge: 0aefa6a e129551 Author: Henri Vasserman <[email protected]> Date: Sun May 7 18:06:04 2023 +0300 Merge 'origin/master' into hipblas commit 0aefa6a Merge: baeb482 1b0fd45 Author: Henri Vasserman <[email protected]> Date: Sun May 7 12:24:41 2023 +0300 Merge 'origin/master' into hipblas commit baeb482 Author: Henri Vasserman <[email protected]> Date: Sun May 7 12:24:12 2023 +0300 Revert to default copy commit 289073a Merge: 1107194 173d0e6 Author: Henri Vasserman <[email protected]> Date: Sat May 6 19:59:41 2023 +0300 Merge 'origin/master' into hipblas commit 
1107194 Merge: 04c0d48 a3b85b2 Author: Henri Vasserman <[email protected]> Date: Sat May 6 00:38:20 2023 +0300 Merge 'origin/master' into hipblas commit 04c0d48 Author: Henri Vasserman <[email protected]> Date: Thu May 4 12:31:16 2023 +0300 Move all HIP stuff to ggml-cuda.cu commit d83cfba Merge: b67cc50 799fdc1 Author: Henri Vasserman <[email protected]> Date: Thu May 4 11:31:16 2023 +0300 Merge 'origin/master' into hipblas commit b67cc50 Merge: fcbc262 e216aa0 Author: Henri Vasserman <[email protected]> Date: Wed May 3 15:04:51 2023 +0300 Merge 'origin/master' into hipblas commit fcbc262 Merge: c73def1 f4cef87 Author: Henri Vasserman <[email protected]> Date: Mon May 1 22:45:29 2023 +0300 Merge 'origin/master' into hipblas commit c73def1 Merge: d8ea75e f0d70f1 Author: Henri Vasserman <[email protected]> Date: Sun Apr 30 18:40:42 2023 +0300 Merge 'origin/master' into hipblas commit d8ea75e Merge: d194586 334637e Author: Henri Vasserman <[email protected]> Date: Sat Apr 29 11:25:51 2023 +0300 Merge 'origin/master' into hipblas commit d194586 Merge: 2ab9d11 7f15c5c Author: Henri Vasserman <[email protected]> Date: Fri Apr 28 23:03:52 2023 +0300 Merge 'origin/master' into hipblas commit 2ab9d11 Merge: 3b4a531 04aaae1 Author: Henri Vasserman <[email protected]> Date: Fri Apr 28 16:30:05 2023 +0300 Merge 'origin/master' into hipblas commit 3b4a531 Merge: a1caa48 0b2da20 Author: Henri Vasserman <[email protected]> Date: Fri Apr 28 10:08:41 2023 +0300 Merge 'origin/master' into hipblas commit a1caa48 Author: Henri Vasserman <[email protected]> Date: Fri Apr 28 10:08:21 2023 +0300 add more cuda defines This is so 'slaren/cuda-f16f32' would merge. 
commit ecc0565 Author: Henri Vasserman <[email protected]> Date: Fri Apr 28 01:58:27 2023 +0300 only .cu file needs to be complied as device commit ef51e9e Merge: d571d16 4afcc37 Author: Henri Vasserman <[email protected]> Date: Wed Apr 26 12:46:26 2023 +0300 Merge branch 'ggerganov:master' into hipblas commit d571d16 Merge: 608aa33 dd0eabc Author: Henri Vasserman <[email protected]> Date: Tue Apr 25 21:15:33 2023 +0300 Merge 'origin/master' into hipblas commit 608aa33 Author: Henri Vasserman <[email protected]> Date: Tue Apr 25 21:15:04 2023 +0300 change default GPU arch to match CMake commit 3a004b2 Author: Henri Vasserman <[email protected]> Date: Mon Apr 24 02:24:54 2023 +0300 add rpath commit db7a012 Merge: 3677235 284685f Author: Henri Vasserman <[email protected]> Date: Sun Apr 23 21:49:28 2023 +0300 Merge 'origin/master' into hipblas commit 3677235 Author: Henri Vasserman <[email protected]> Date: Sat Apr 22 23:28:00 2023 +0300 More build file changes commit d3e1984 Author: Henri Vasserman <[email protected]> Date: Fri Apr 21 03:32:06 2023 +0300 add rpath commit 0e005f7 Author: Henri Vasserman <[email protected]> Date: Fri Apr 21 02:13:00 2023 +0300 Build file changes Now HIP Clang is not required, the CMake scripts will configure the needed compiler, which can be system clang++. Also other code can still use GCC, but CMake will force the clang to link. commit 54a63c1 Author: Henri Vasserman <[email protected]> Date: Thu Apr 20 22:19:22 2023 +0300 Update Makefile for the Cuda kernels commit 0fd8363 Author: Henri Vasserman <[email protected]> Date: Thu Apr 20 02:04:00 2023 +0300 use hipblas based on cublas * Merge Fixes * readme merge fix * remove old ggmlv2 changes * bring ggml v2_cuda up to date with AMD changes * Revert ggml v2_cuda changes BC they werent needed This reverts commit 3385dd4. * avoid launching subprocesses to get device names for now, but other than that seems to be working --------- Co-authored-by: Concedo <[email protected]>

* main : don't print special tokens with --grammar

  The CLI interface was recently changed to print special control tokens such as the </s> stop token. This token shouldn't be printed if the grammar flag was passed, unless the grammar specifies it, because that breaks shell-scriptability.
* main: use separate stream for control characters
* main: use dprintf and add --ctrl-token-no-out and --ctrl-token-fd-out
* main: dprintf isn't part of the IEEE POSIX standard. Just use write().
* main: remove --ctrl-token-fd-out in favor of fcntl() based detection
* common.cpp: accidentally removed --interactive-first
* main: only merge stdout and control token if not in conversation or grammar mode
* main: rejig control token descriptor handling
* main: must check pipe status at the very top of the program
* main: renamed --no-special from --ctrl-token-no-out and other refactoring
* main: refactor ctrl_token_no_out --> no_special
* llama: rename llama_token_is_control_token() to llama_token_is_control()
* main: remove special token file descriptor feature (#5)

---------

Co-authored-by: Brian <[email protected]>
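The behaviour described in this commit message can be sketched in a few lines. This is only an illustration, not the code from the commit: the `Token` struct and `render_tokens` function are hypothetical names, and the `is_control` flag stands in for whatever the real program asks the model vocabulary (the message mentions `llama_token_is_control()` for that purpose).

```cpp
#include <string>
#include <vector>

// Hypothetical token record used only for this sketch.
// In the real program the control-token check would come from the model vocabulary.
struct Token {
    std::string text;
    bool is_control;
};

// Concatenate token texts; when no_special is true, control tokens
// (for example an end-of-sequence marker) are skipped so that the
// output stays safe to consume from a shell script.
std::string render_tokens(const std::vector<Token>& tokens, bool no_special) {
    std::string out;
    for (const Token& tok : tokens) {
        if (no_special && tok.is_control) {
            continue;  // suppress the control token
        }
        out += tok.text;
    }
    return out;
}
```

With `no_special` set, a sequence ending in `</s>` renders without the stop marker; with it unset, the marker is kept in the output.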

…gml-org#16038)

Initializing RESERVED_NAME in is_reserved_name() is not thread safe and leads to corrupted memory when used from multiple threads, as can be seen in the asan trace below. This fixes the initialization to make it thread-safe.

    #0 0x000100abd018 in std::__1::pair<std::__1::__hash_iterator<std::__1::__hash_node<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, void*>*>, bool> std::__1::__hash_table<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::hash<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>, std::__1::equal_to<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>>::__emplace_unique_key_args<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&>(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&) __hash_table:1565
    #1 0x000100ab0320 in SchemaConverter::visit(nlohmann::json_abi_v3_12_0::basic_json<nlohmann::json_abi_v3_12_0::ordered_map, std::__1::vector, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, bool, long long, unsigned long long, double, std::__1::allocator, nlohmann::json_abi_v3_12_0::adl_serializer, std::__1::vector<unsigned char, std::__1::allocator<unsigned char>>, void> const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&) json-schema-to-grammar.cpp:802
    #2 0x000100aafc48 in std::__1::__function::__func<build_grammar(std::__1::function<void (common_grammar_builder const&)> const&, common_grammar_options const&)::$_2, std::__1::allocator<build_grammar(std::__1::function<void (common_grammar_builder const&)> const&, common_grammar_options const&)::$_2>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> (std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, nlohmann::json_abi_v3_12_0::basic_json<nlohmann::json_abi_v3_12_0::ordered_map, std::__1::vector, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, bool, long long, unsigned long long, double, std::__1::allocator, nlohmann::json_abi_v3_12_0::adl_serializer, std::__1::vector<unsigned char, std::__1::allocator<unsigned char>>, void> const&)>::operator()(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, nlohmann::json_abi_v3_12_0::basic_json<nlohmann::json_abi_v3_12_0::ordered_map, std::__1::vector, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, bool, long long, unsigned long long, double, std::__1::allocator, nlohmann::json_abi_v3_12_0::adl_serializer, std::__1::vector<unsigned char, std::__1::allocator<unsigned char>>, void> const&) function.h:319
    #3 0x000100a2c938 in std::__1::__function::__func<common_chat_params_init_llama_3_x(minja::chat_template const&, templates_params const&, bool)::$_0::operator()(common_grammar_builder const&) const::'lambda'(nlohmann::json_abi_v3_12_0::basic_json<nlohmann::json_abi_v3_12_0::ordered_map, std::__1::vector, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, bool, long long, unsigned long long, double, std::__1::allocator, nlohmann::json_abi_v3_12_0::adl_serializer, std::__1::vector<unsigned char, std::__1::allocator<unsigned char>>, void> const&), std::__1::allocator<common_chat_params_init_llama_3_x(minja::chat_template const&, templates_params const&, bool)::$_0::operator()(common_grammar_builder const&) const::'lambda'(nlohmann::json_abi_v3_12_0::basic_json<nlohmann::json_abi_v3_12_0::ordered_map, std::__1::vector, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, bool, long long, unsigned long long, double, std::__1::allocator, nlohmann::json_abi_v3_12_0::adl_serializer, std::__1::vector<unsigned char, std::__1::allocator<unsigned char>>, void> const&)>, void (nlohmann::json_abi_v3_12_0::basic_json<nlohmann::json_abi_v3_12_0::ordered_map, std::__1::vector, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, bool, long long, unsigned long long, double, std::__1::allocator, nlohmann::json_abi_v3_12_0::adl_serializer, std::__1::vector<unsigned char, std::__1::allocator<unsigned char>>, void> const&)>::operator()(nlohmann::json_abi_v3_12_0::basic_json<nlohmann::json_abi_v3_12_0::ordered_map, std::__1::vector, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, bool, long long, unsigned long long, double, std::__1::allocator, nlohmann::json_abi_v3_12_0::adl_serializer, std::__1::vector<unsigned char, std::__1::allocator<unsigned char>>, void> const&) function.h:319
    #4 0x000100a139f8 in foreach_function(nlohmann::json_abi_v3_12_0::basic_json<nlohmann::json_abi_v3_12_0::ordered_map, std::__1::vector, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, bool, long long, unsigned long long, double, std::__1::allocator, nlohmann::json_abi_v3_12_0::adl_serializer, std::__1::vector<unsigned char, std::__1::allocator<unsigned char>>, void> const&, std::__1::function<void (nlohmann::json_abi_v3_12_0::basic_json<nlohmann::json_abi_v3_12_0::ordered_map, std::__1::vector, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, bool, long long, unsigned long long, double, std::__1::allocator, nlohmann::json_abi_v3_12_0::adl_serializer, std::__1::vector<unsigned char, std::__1::allocator<unsigned char>>, void> const&)> const&) chat.cpp:762
    #5 0x000100a2a7f4 in std::__1::__function::__func<common_chat_params_init_llama_3_x(minja::chat_template const&, templates_params const&, bool)::$_0, std::__1::allocator<common_chat_params_init_llama_3_x(minja::chat_template const&, templates_params const&, bool)::$_0>, void (common_grammar_builder const&)>::operator()(common_grammar_builder const&) function.h:319
    #6 0x000100aa98f4 in build_grammar(std::__1::function<void (common_grammar_builder const&)> const&, common_grammar_options const&) json-schema-to-grammar.cpp:982
    #7 0x0001009c9314 in common_chat_params_init_llama_3_x(minja::chat_template const&, templates_params const&, bool) chat.cpp:1110
    #8 0x0001009b8afc in common_chat_templates_apply_jinja(common_chat_templates const*, common_chat_templates_inputs const&) chat.cpp:1992
    #9 0x0001009b533c in common_chat_templates_apply(common_chat_templates const*, common_chat_templates_inputs const&) chat.cpp:2074
    #10 0x000100810120 in llamacpp_apply_chat_template+0x724 (predict_oai-98384e17fb94e863:arm64+0x100090120)
    ...

==45482==Register values:
x[0] = 0x00006020004147f8 x[1] = 0x00006080000013c8 x[2] = 0x0000000000000000 x[3] = 0x0000604006289738
x[4] = 0x0000000000000002 x[5] = 0x0000000000000001 x[6] = 0x04034000004b4000 x[7] = 0x0000000000000001
x[8] = 0xbebebebebebebebe x[9] = 0x17d7d7d7d7d7d7d7 x[10] = 0x00000c04000828ff x[11] = 0x0000000000000001
x[12] = 0x000000002018d383 x[13] = 0x0000000000000000 x[14] = 0xfa0000000000fafa x[15] = 0x000010700001ffff
x[16] = 0x000000019dc012c0 x[17] = 0x00000001021284f8 x[18] = 0x0000000000000000 x[19] = 0x00000001700acdc0
x[20] = 0x0000000000000002 x[21] = 0x000000002018d384 x[22] = 0x16dd16fd2e731151 x[23] = 0x0000007000020000
x[24] = 0x0000000100c69c08 x[25] = 0x0000000100c69c20 x[26] = 0x00006080000013c7 x[27] = 0x0000000100c69c00
x[28] = 0x00000001700acd60 fp = 0x00000001700aceb0 lr = 0x0000000100abce30 sp = 0x00000001700acd60
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV __hash_table:1565 in std::__1::pair<std::__1::__hash_iterator<std::__1::__hash_node<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, void*>*>, bool> std::__1::__hash_table<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::hash<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>, std::__1::equal_to<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>>::__emplace_unique_key_args<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&>(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&)
Thread T5 created by T0 here:
    #0 0x0001020b99d4 in pthread_create+0x5c (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x359d4)
    #1 0x000100873910 in std::sys::pal::unix::thread::Thread::new::h77254fdd87a28e05+0x118 (predict_oai-98384e17fb94e863:arm64+0x1000f3910)
    #2 0x0001007c7a1c in test::run_test::haeb3c2bcd5ed6cf6+0x76c (predict_oai-98384e17fb94e863:arm64+0x100047a1c)
    #3 0x0001007aedb0 in test::console::run_tests_console::he9d142d704f3a986+0x149c (predict_oai-98384e17fb94e863:arm64+0x10002edb0)
    #4 0x0001007c5758 in test::test_main::hf86a5e20735245b9+0x118 (predict_oai-98384e17fb94e863:arm64+0x100045758)
    #5 0x0001007c5da0 in test::test_main_static::h61ee9c8fd30abca0+0x54 (predict_oai-98384e17fb94e863:arm64+0x100045da0)
    ...
==45482==ABORTING
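The commit message does not say how the initialization was made thread-safe. One common approach, shown here purely as an assumption, is to build the set inside a function-local `static const` initializer, which C++11 and later guarantee runs exactly once even when several threads reach it concurrently. The function below reuses the name `is_reserved_name` from the message, but its body and the listed names are illustrative placeholders, not the project's actual code.

```cpp
#include <string>
#include <unordered_set>

// Illustrative sketch only: the reserved names below are placeholders.
bool is_reserved_name(const std::string& name) {
    // A function-local static is initialized exactly once; if several
    // threads arrive at the same time, all but one wait until the
    // initializer has finished. The set is const afterwards, so
    // concurrent lookups do not modify shared state.
    static const std::unordered_set<std::string> reserved = [] {
        std::unordered_set<std::string> s;
        s.insert("root");
        s.insert("boolean");
        s.insert("number");
        s.insert("string");
        return s;
    }();
    return reserved.find(name) != reserved.end();
}
```

The design point is that the set is fully built before any caller can observe it, in contrast to a lazily filled mutable static, where one thread can insert while another reads. `std::call_once` with a `std::once_flag` would be an alternative with the same guarantee.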

YellowRoseCx and others added 23 commits April 14, 2023 17:22

add wordstopper function

842a68f

add wordstopper function

544567e

Create wordstoppers.txt

80d7f86

cmake : add finding the OpenBLAS header file (LostRuins#992)

106faaf

benchmark : fix result validation in benchmark-q4_0-matmult (LostRuins#987)

c12b14b

Delete make_pyinstaller.bat

7df4922

Update koboldcpp.py

d2c7f2d

Add files via upload

be913aa

ggml : use posix_memalign on non-Windows env

aa485ce

Update README.md

4f07ad8

Update README.md

50d2815

Update wordstoppers.txt

e6756eb

Refactor ggml.c for future tensor types (LostRuins#1001)

0ad9646

Fix potential int8 overflow in non-SIMD vec_dot (LostRuins#986)

2f7c8e0

Update README.md

472ee3f

added wordstopper

4399c55

add wordstopper function

032180b

Merge branch 'base' into upstreamchanges

71dc2a5

Merge pull request #3 from ggerganov/master

8acd559

ggerganov master to upstreamchanges

Delete CMakeLists.txt

2bad163

Merge branch 'upstreamchanges' into ggerganov

1a43981

YellowRoseCx merged commit d43443e into concedo Apr 16, 2023

YellowRoseCx deleted the ggerganov branch April 21, 2023 21:09

YellowRoseCx added a commit that referenced this pull request Aug 9, 2023

Fix multi GPU on multiple amd architectures with rocblas_initialize() (#5)

1f6294d

* initialize rocblas

YellowRoseCx commented Apr 16, 2023

6 participants
