# Chat with LLMs Everywhere
Torchchat is a small codebase that showcases running large language models (LLMs) within Python, or within your own (C/C++) application, on mobile (iOS/Android), desktop, and servers.
## Highlights
- Command line interaction with popular LLMs such as Llama 3, Llama 2, Stories, Mistral and more
- iOS 17+ (iPhone 13 Pro+)
- Multiple data types including: float32, float16, bfloat16
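Precision mainly trades accuracy for memory: halving the bytes per parameter roughly halves the weight footprint. A back-of-the-envelope sketch (the parameter count is illustrative, not a measurement of any specific model):

```python
# Rough weight-memory estimate per dtype: 4 bytes per parameter for
# float32, 2 bytes for float16 and bfloat16 (activations and KV cache
# are not counted here).
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2}

def weights_gib(n_params: float, dtype: str) -> float:
    """Approximate size of the model weights in GiB."""
    return n_params * BYTES_PER_PARAM[dtype] / 2**30

for dtype in BYTES_PER_PARAM:
    print(f"8B params @ {dtype}: {weights_gib(8e9, dtype):.1f} GiB")
```

This is why half-precision types are the usual choice on memory-constrained devices.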
The following steps require that you have [Python 3.10](https://www.python.org/downloads/release/python-3100/) installed.
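A quick way to confirm that your interpreter qualifies (a minimal sketch; run it with the `python3` you plan to use):

```python
# Check that the active interpreter is new enough (3.10+).
import sys

def meets_requirement(info=None) -> bool:
    """Return True when the (major, minor) version is at least 3.10."""
    if info is None:
        info = sys.version_info
    return tuple(info[:2]) >= (3, 10)

print(meets_requirement())
```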
```
source .venv/bin/activate
# ensure everything installed correctly
python3 torchchat.py --help
```
### Download Weights
Most models use HuggingFace as the distribution channel, so you will need to create a HuggingFace account.
Create a HuggingFace user access token [as documented here](https://huggingface.co/docs/hub/en/security-tokens).
Run `huggingface-cli login`, which will prompt for the newly created token.
Once this is done, torchchat will be able to download model artifacts from HuggingFace.
```
python3 torchchat.py download llama3
```
## What can you do with torchchat?
* Run models via PyTorch / Python:
  * [Chat](#chat)
  * [Generate](#generate)
  * [Run via Browser](#browser)
  * [Quantizing your model (suggested for mobile)](#quantization)
* Export and run models in native environments (C++, your own app, mobile, etc.):
  * [Exporting for desktop/servers via AOTInductor](#export-server)
  * [Running exported .so file via your own C++ application](#run-server)
    * in Chat mode
    * in Generate mode
  * [Exporting for mobile via ExecuTorch](#export-executorch)
    * in Chat mode
    * in Generate mode
  * [Running exported executorch file on iOS or Android](#run-mobile)
See the [documentation on GGUF](docs/GGUF.md) to learn how to use GGUF files.
## Running via PyTorch / Python
### Chat
Designed for interactive and conversational use.
In chat mode, the LLM engages in a back-and-forth dialogue with the user. It responds to queries, participates in discussions, provides explanations, and can adapt to the flow of conversation.
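That turn-taking structure can be sketched as a loop that re-feeds the accumulated history on each turn (a toy illustration; `respond` stands in for real model generation and is not part of torchchat's API):

```python
# Minimal sketch of a chat loop: each turn appends to a shared history,
# so the model sees the whole conversation as context. `respond` is a
# placeholder for actual LLM generation.
from typing import Callable, List

def chat_turn(history: List[str], user_msg: str,
              respond: Callable[[List[str]], str]) -> str:
    history.append(f"User: {user_msg}")
    reply = respond(history)  # model conditions on the full history
    history.append(f"Assistant: {reply}")
    return reply

# Toy "model" that just reports which turn it is replying to.
history: List[str] = []
echo = lambda h: f"(reply #{len(h) // 2 + 1})"
chat_turn(history, "Hello", echo)
chat_turn(history, "Tell me more", echo)
print(history[-1])
```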