
Commit a95d564

mikekgfb and soumith authored and committed
Update README.md (#389)
Co-authored-by: Soumith Chintala <[email protected]>
1 parent b6ccd15 commit a95d564

File tree

1 file changed: +62 -23 lines changed


README.md

Lines changed: 62 additions & 23 deletions
@@ -1,5 +1,5 @@
 # Chat with LLMs Everywhere
-Torchchat is an easy-to-use library for running large language models (LLMs) on edge devices including mobile phones and desktops.
+Torchchat is a small codebase to showcase running large language models (LLMs) within Python OR within your own (C/C++) application on mobile (iOS/Android), desktop and servers.
 
 ## Highlights
 - Command line interaction with popular LLMs such as Llama 3, Llama 2, Stories, Mistral and more
@@ -12,10 +12,10 @@ Torchchat is an easy-to-use library for running large language models (LLMs) on
 - iOS 17+ (iPhone 13 Pro+)
 - Multiple data types including: float32, float16, bfloat16
 - Multiple quantization schemes
-- Multiple execution modes including: Eager, Compile, AOT Inductor (AOTI) and ExecuTorch
+- Multiple execution modes including: Python (Eager, Compile) or Native (AOT Inductor (AOTI), ExecuTorch)
+
+## Installation
 
-## Quick Start
-### Initialize the Environment
 The following steps require that you have [Python 3.10](https://www.python.org/downloads/release/python-3100/) installed.
 
 ```
@@ -32,31 +32,64 @@ source .venv/bin/activate
 
 # ensure everything installed correctly
 python3 torchchat.py --help
-
 ```
 
-### Generating Text
-
-```
-python3 torchchat.py generate stories15M
-```
-That’s all there is to it!
-Read on to learn how to use the full power of torchchat.
+### Download Weights
+Most models use HuggingFace as the distribution channel, so you will need to create a HuggingFace
+account.
 
-## Customization
-For the full details on all commands and parameters run `python3 torchchat.py --help`
+Create a HuggingFace user access token [as documented here](https://huggingface.co/docs/hub/en/security-tokens).
+Run `huggingface-cli login`, which will prompt for the newly created token.
 
-### Download
-For supported models, torchchat can download model weights. Most models use HuggingFace as the distribution channel, so you will need to create a HuggingFace
-account and install `huggingface-cli`.
-
-To install `huggingface-cli`, run `pip install huggingface-cli`. After installing, create a user access token [as documented here](https://huggingface.co/docs/hub/en/security-tokens). Run `huggingface-cli login`, which will prompt for the newly created token. Once this is done, torchchat will be able to download model artifacts from
+Once this is done, torchchat will be able to download model artifacts from
 HuggingFace.
 
 ```
 python3 torchchat.py download llama3
 ```
 
+## What can you do with torchchat?
+
+* Run models via PyTorch / Python:
+  * [Chat](#chat)
+  * [Generate](#generate)
+  * [Run via Browser](#browser)
+* [Quantizing your model (suggested for mobile)](#quantization)
+* Export and run models in native environments (C++, your own app, mobile, etc.)
+  * [Exporting for desktop/servers via AOTInductor](#export-server)
+  * [Running exported .so file via your own C++ application](#run-server)
+    * in Chat mode
+    * in Generate mode
+  * [Exporting for mobile via ExecuTorch](#export-executorch)
+    * in Chat mode
+    * in Generate mode
+  * [Running exported executorch file on iOS or Android](#run-mobile)
+
+## Models
+These are the supported models:
+| Model | Mobile Friendly | Notes |
+|------------------|---|---------------------|
+|[meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct)|||
+|[meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B)|||
+|[meta-llama/Llama-2-7b-chat-hf](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf)|||
+|[meta-llama/Llama-2-13b-chat-hf](https://huggingface.co/meta-llama/Llama-2-13b-chat-hf)|||
+|[meta-llama/Llama-2-70b-chat-hf](https://huggingface.co/meta-llama/Llama-2-70b-chat-hf)|||
+|[meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf)|||
+|[meta-llama/CodeLlama-7b-Python-hf](https://huggingface.co/meta-llama/CodeLlama-7b-Python-hf)|||
+|[meta-llama/CodeLlama-34b-Python-hf](https://huggingface.co/meta-llama/CodeLlama-34b-Python-hf)|||
+|[mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1)|||
+|[mistralai/Mistral-7B-Instruct-v0.1](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1)|||
+|[mistralai/Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2)|||
+|[tinyllamas/stories15M](https://huggingface.co/karpathy/tinyllamas/tree/main)|||
+|[tinyllamas/stories42M](https://huggingface.co/karpathy/tinyllamas/tree/main)|||
+|[tinyllamas/stories110M](https://huggingface.co/karpathy/tinyllamas/tree/main)|||
+|[openlm-research/open_llama_7b](https://huggingface.co/openlm-research/open_llama_7b)|||
+
+See the [documentation on GGUF](docs/GGUF.md) to learn how to use GGUF files.
+
+
+## Running via PyTorch / Python
+
 ### Chat
 Designed for interactive and conversational use.
 In chat mode, the LLM engages in a back-and-forth dialogue with the user. It responds to queries, participates in discussions, provides explanations, and can adapt to the flow of conversation.
@@ -79,19 +112,25 @@ For more information run `python3 torchchat.py generate --help`
 python3 torchchat.py generate llama3 --dtype=fp16 --tiktoken
 ```
 
-### Export
+## Exporting your model
 Compiles a model and saves it to run later.
 
 For more information run `python3 torchchat.py export --help`
 
-**Examples**
+### Exporting for Desktop / Server-side via AOT Inductor
 
-AOT Inductor:
 ```
 python3 torchchat.py export stories15M --output-dso-path stories15M.so
 ```
 
-ExecuTorch:
+This produces a `.so` file, also called a Dynamic Shared Object. This `.so` can be linked into your own C++ program.
+
+### Running the exported `.so` via your own C++ application
+
+[TBF]
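The section above is left as `[TBF]` in this commit, so the following is only a non-authoritative sketch of what such a C++ runner could look like. It assumes a recent libtorch that ships the AOTI model-container runner (`torch/csrc/inductor/aoti_runner/model_container_runner_cpu.h`), and it guesses at the exported model's input signature (token ids plus an input position); neither assumption comes from this commit.

```
// Hypothetical sketch: load stories15M.so (produced above) and run one
// forward pass. Class/header names assume a recent libtorch; the input
// signature (token ids + input position) is a guess, not torchchat's API.
#include <torch/csrc/inductor/aoti_runner/model_container_runner_cpu.h>
#include <torch/torch.h>

#include <iostream>
#include <vector>

int main() {
  // Load the Dynamic Shared Object produced by `torchchat.py export`.
  torch::inductor::AOTIModelContainerRunnerCpu runner("stories15M.so");

  // Example inputs: a batch of token ids and a starting position.
  // Shapes and dtypes must match whatever the model was exported with.
  std::vector<torch::Tensor> inputs = {
      torch::tensor({{1, 2, 3}}, torch::kLong),  // token ids (placeholder)
      torch::tensor({0}, torch::kLong)};         // input position (placeholder)

  // Run the compiled model and inspect the logits it returns.
  std::vector<torch::Tensor> outputs = runner.run(inputs);
  std::cout << "output shape: " << outputs[0].sizes() << std::endl;
  return 0;
}
```

Such a program would be compiled against the libtorch headers and linked with `libtorch_cpu` and `libc10`.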
+
+### Exporting for Mobile via ExecuTorch
+
 ```
 python3 torchchat.py export stories15M --output-pte-path stories15M.pte
 ```
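As with the `.so` above, the `.pte` file is meant to be loaded by the ExecuTorch runtime inside your own application. The sketch below is likewise only a hypothetical illustration, assuming ExecuTorch's `Module` extension (`executorch/extension/module/module.h`); header paths and namespaces vary between ExecuTorch releases and are not specified by this commit.

```
// Hypothetical sketch: open a torchchat-exported .pte with ExecuTorch's
// Module extension and verify it loads. API names are assumptions and
// may differ across ExecuTorch releases.
#include <executorch/extension/module/module.h>

#include <iostream>

int main() {
  // Wrap the exported program; the file is loaded lazily.
  torch::executor::Module module("stories15M.pte");

  // Force the load and check for errors before attempting inference.
  if (module.load() != torch::executor::Error::Ok) {
    std::cerr << "failed to load stories15M.pte" << std::endl;
    return 1;
  }

  // From here, a generation loop would pass token tensors to
  // module.forward(...); input construction is model-specific.
  std::cout << "stories15M.pte loaded" << std::endl;
  return 0;
}
```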
