This repository was archived by the owner on Jun 24, 2024. It is now read-only.

Develop #442

Closed
philpax wants to merge 63 commits into main from develop

Conversation

philpax (Collaborator) commented on Nov 12, 2023

The pending PRs were interrelated, but I didn't want to leave main in a half-working state, so I've merged them all into a new develop branch. The plan is to work on this branch and leave main in maintenance mode until this is ready.

Closes #365, closes #403, closes #439, closes #77.

This integrates:

  • a GGML version upgrade
  • GGUF support (see the header-reading sketch after this list)
  • BERT support
  • APIs for context-shuffling
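
For context on the GGUF item: GGUF is a container format that starts with a small fixed header (magic, version, tensor count, metadata key/value count), followed by the metadata and tensor info. A minimal sketch of reading that header, assuming a little-endian GGUF v2/v3 file; the file path is a placeholder, and this is not the loader code from this PR:

```rust
use std::fs::File;
use std::io::{self, Read};

/// The GGUF magic: the ASCII bytes "GGUF".
const GGUF_MAGIC: [u8; 4] = *b"GGUF";

fn read_u32(r: &mut impl Read) -> io::Result<u32> {
    let mut buf = [0u8; 4];
    r.read_exact(&mut buf)?;
    Ok(u32::from_le_bytes(buf))
}

fn read_u64(r: &mut impl Read) -> io::Result<u64> {
    let mut buf = [0u8; 8];
    r.read_exact(&mut buf)?;
    Ok(u64::from_le_bytes(buf))
}

fn main() -> io::Result<()> {
    // Placeholder path; any GGUF v2/v3 model file would do.
    let mut file = File::open("model.gguf")?;

    // Check the 4-byte magic that identifies a GGUF file.
    let mut magic = [0u8; 4];
    file.read_exact(&mut magic)?;
    assert_eq!(magic, GGUF_MAGIC, "not a GGUF file");

    // Version, then counts of tensors and metadata key/value pairs
    // (u64 in v2+; v1 used u32 here).
    let version = read_u32(&mut file)?;
    let tensor_count = read_u64(&mut file)?;
    let metadata_kv_count = read_u64(&mut file)?;

    println!("GGUF v{version}: {tensor_count} tensors, {metadata_kv_count} metadata entries");
    Ok(())
}
```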

This is the to-do list:

  • Update to the latest GGML
  • Fix CUDA inference
  • Fix OpenCL inference
  • Fix Metal inference
  • Fix the embedded tokenizer
  • Re-add quantisation
  • Modularize the model definitions (i.e. move block inference to the block struct; see the sketch after this list)
  • Fix models (ensure they're uncommented in llm):
    • Fix BLOOM
    • Fix GPT-NeoX
    • Fix Falcon
    • Fix GPT-2
    • Fix GPT-J
    • Fix MPT
    • Fix BERT
  • Remove the expect() calls
  • Fix the TODOs
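
On the modularization item: the idea, as stated, is to move per-block inference out of each model's monolithic eval function and into the block struct itself, so a model's forward pass reduces to folding the input through its blocks. A hypothetical sketch of that shape; Tensor, ComputeContext, and the Block trait are illustrative stand-ins, not the crate's actual types:

```rust
// Hypothetical stand-ins for the crate's GGML tensor/context wrappers.
struct Tensor;
struct ComputeContext;

/// Each block owns its weights and its own forward pass, so model code
/// no longer needs to inline every layer's graph construction.
trait Block {
    fn forward(&self, ctx: &mut ComputeContext, input: Tensor) -> Tensor;
}

struct TransformerBlock; // attention + feed-forward weights would live here

impl Block for TransformerBlock {
    fn forward(&self, _ctx: &mut ComputeContext, input: Tensor) -> Tensor {
        // attention -> residual add -> feed-forward -> residual add
        input
    }
}

struct Model {
    blocks: Vec<Box<dyn Block>>,
}

impl Model {
    /// The per-model loop: fold the input through the blocks.
    fn forward(&self, ctx: &mut ComputeContext, mut x: Tensor) -> Tensor {
        for block in &self.blocks {
            x = block.forward(ctx, x);
        }
        x
    }
}

fn main() {
    let model = Model { blocks: vec![Box::new(TransformerBlock)] };
    let mut ctx = ComputeContext;
    let _out = model.forward(&mut ctx, Tensor);
}
```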

oppiliappan and others added 30 commits, starting August 7, 2023 14:55. Among those visible in the conversation:

  • a commit co-authored by Lukas Kreussel <[email protected]> and Philpax <[email protected]>, noted as "with some heavy caveats, see the PR"
  • Build against newer GGML version
  • Add "context swap" functions to session and add "decoded_tokens" to snapshot read/write
philpax closed this on Jun 24, 2024
Development

Successfully merging this pull request may close these issues.

  • Why is the feed_prompt process so slow?
  • Metal Prompt Feeding Support
  • GGUF
  • Swap strategy for infinite output
4 participants