
llama_tokenize: too many tokens #92


Closed
ylchin opened this issue Apr 18, 2023 · 2 comments
Labels
model (Model specific issue), quality (Quality of model output)

Comments

@ylchin

ylchin commented Apr 18, 2023

I am trying to run an Alpaca model in a framework with a relatively large context window, and the following message keeps popping up:
llama_tokenize: too many tokens
How can I bypass this, and what is the maximum number of tokens in this case?

@gjmulder
Contributor

I believe this may be a limit in llama.cpp. There was a discussion about having to increase some memory buffers to support larger context sizes.
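
For reference, the message itself is printed by llama.cpp when the token buffer handed to llama_tokenize is smaller than the number of tokens the text produces; at the time, llama-cpp-python sized that buffer to the context window, so the effective maximum was n_ctx. A minimal sketch of how the limit shows up (model path and n_ctx value are illustrative):

```python
from llama_cpp import Llama

# Illustrative values; the effective limit is the n_ctx the context was created with.
n_ctx = 2048
llm = Llama(model_path="./models/ggml-alpaca-7b-q4_0.bin", n_ctx=n_ctx)

prompt = open("prompt.txt", "rb").read()

# On affected versions this call printed "llama_tokenize: too many tokens"
# and failed once the prompt produced more than n_ctx tokens.
tokens = llm.tokenize(prompt)
print(f"prompt tokens: {len(tokens)} / context window: {n_ctx}")
```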

gjmulder added the model (Model specific issue) and quality (Quality of model output) labels on May 12, 2023
@abetlen
Owner

abetlen commented May 12, 2023

@gjmulder @ylchin Fixed: you can now tokenize prompts longer than the context length. While this doesn't help with processing longer contexts, it should help with chunking conversations and documents for embeddings.
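
For anyone landing here for the chunking use case: with the fix in, tokenize() accepts text longer than the context length, so a long document can be tokenized once and split into context-sized pieces before embedding. A rough sketch assuming the standard llama-cpp-python Llama API (model path, n_ctx, and chunk size are illustrative):

```python
from llama_cpp import Llama

# Illustrative values; embedding=True is needed to call create_embedding().
llm = Llama(model_path="./models/ggml-alpaca-7b-q4_0.bin", n_ctx=512, embedding=True)

text = open("long_document.txt", "rb").read()

# With the fix, tokenize() no longer fails on text longer than n_ctx,
# so the whole document can be tokenized up front...
tokens = llm.tokenize(text)

# ...and then split into chunks that each fit comfortably inside the context window.
chunk_size = 256
chunks = [tokens[i : i + chunk_size] for i in range(0, len(tokens), chunk_size)]

embeddings = []
for chunk in chunks:
    piece = llm.detokenize(chunk).decode("utf-8", errors="ignore")
    embeddings.append(llm.create_embedding(piece)["data"][0]["embedding"])

print(f"{len(chunks)} chunks embedded")
```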

carmonajca added a commit to carmonajca/llama-cpp-python that referenced this issue May 17, 2023
* Bugfix: Ensure logs are printed when streaming

* Update llama.cpp

* Update llama.cpp

* Add missing tfs_z parameter

* Bump version

* Fix docker command

* Revert "llama_cpp server: prompt is a string". Closes abetlen#187

This reverts commit b9098b0.

* Only support generating one prompt at a time.

* Allow model to tokenize strings longer than context length and set add_bos. Closes abetlen#92

* Update llama.cpp

* Bump version

* Update llama.cpp

* Fix obscure Windows DLL issue. Closes abetlen#208

* chore: add note for Mac m1 installation

* Add winmode arg only on windows if python version supports it

* Bump mkdocs-material from 9.1.11 to 9.1.12

Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 9.1.11 to 9.1.12.
- [Release notes](https://github.com/squidfunk/mkdocs-material/releases)
- [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/CHANGELOG)
- [Commits](squidfunk/mkdocs-material@9.1.11...9.1.12)

---
updated-dependencies:
- dependency-name: mkdocs-material
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>

* Update README.md

Fix typo.

* Fix CMakeLists.txt

* Add sampling defaults for generate

* Update llama.cpp

* Add model_alias option to override model_path in completions. Closes abetlen#39

* Update variable name

* Update llama.cpp

* Fix top_k value. Closes abetlen#220

* Fix last_n_tokens_size

* Implement penalize_nl

* Format

* Update token checks

* Move docs link up

* Fixed CUBLAS DLL load issue in Windows

* Check for CUDA_PATH before adding

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Andrei Betlen <[email protected]>
Co-authored-by: Anchen <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Xiyou Zhou <[email protected]>
Co-authored-by: Aneesh Joy <[email protected]>
xaptronic pushed a commit to xaptronic/llama-cpp-python that referenced this issue Jun 13, 2023
* Add quantize script for batch quantization

* Indentation

* README for new quantize.sh

* Fix script name

* Fix file list on Mac OS

---------

Co-authored-by: Georgi Gerganov <[email protected]>
Labels: model (Model specific issue), quality (Quality of model output)
Projects: None yet
Development: No branches or pull requests
3 participants