* Add build_native.sh and add README.md

  Summary: Adds a script that builds the C++ runner for ExecuTorch (ET) and AOT Inductor (AOTI), and updates README.md to ask users to run it. Also improves build speed by removing duplicate build commands; `install_requirements.sh` now installs all of the C++ libraries the runner needs (see the sketch after this list).
* Revert custom ops change
* Add build_native.sh to CI job
* Add README for building native runner for aoti
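A minimal sketch of the workflow these commits describe, assuming the scripts are run from the repository root; `install_requirements.sh`, `build_native.sh`, and the two output binaries all appear in the diff below, but the exact invocation order is an assumption:

```bash
# Sketch of the build flow described above (assumes repository root).
./install_requirements.sh      # installs the C++ libraries the runner links against

# build_native.sh takes one argument selecting the backend:
scripts/build_native.sh aoti   # AOT Inductor runner -> cmake-out/aoti_run
scripts/build_native.sh et     # ExecuTorch runner   -> cmake-out/et_run
```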
README.md (31 additions, 4 deletions)
````diff
@@ -73,11 +73,10 @@ with `python3 torchchat.py remove llama3`.
 * [Run exported .so file via your own C++ application](#run-server)
   * in Chat mode
   * in Generate mode
-* [Export for mobile via ExecuTorch](#export-executorch)
+* [Export for mobile via ExecuTorch](#exporting-for-mobile-via-executorch)
+* [Run exported ExecuTorch file on iOS or Android](#mobile-execution)
   * in Chat mode
   * in Generate mode
-* [Run exported ExecuTorch file on iOS or Android](#run-mobile)
-
 
 ## Running via PyTorch / Python
````
````diff
@@ -235,6 +234,20 @@ python3 torchchat.py generate --dso-path stories15M.so --prompt "Hello my name i
 
 NOTE: The exported model will be large. We suggest you quantize the model, explained further down, before deploying the model on device.
 
+**Build Native Runner Binary**
+
+We provide an end-to-end C++ [runner](runner/run.cpp) that runs the `*.so` file exported after following the previous [examples](#aoti-aot-inductor) section. To build the runner binary on your Mac or Linux:
+
+```bash
+scripts/build_native.sh aoti
+```
+
+Run:
+
+```bash
+cmake-out/aoti_run model.so -z tokenizer.model -i "Once upon a time"
+```
+
 ### ExecuTorch
 
 ExecuTorch enables you to optimize your model for execution on a mobile or embedded device, but can also be used on desktop for testing.
````
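To make the AOTI section concrete, here is a hedged end-to-end sketch. The `export` invocation and the `stories15M` model name are assumptions inferred from the `--dso-path stories15M.so` usage shown elsewhere in this diff; only the build and run steps are part of this change:

```bash
# Hypothetical end-to-end AOTI flow (the export command is an assumption).
python3 torchchat.py export stories15M --output-dso-path stories15M.so   # assumed export syntax
scripts/build_native.sh aoti                                             # from this PR
cmake-out/aoti_run stories15M.so -z tokenizer.model -i "Once upon a time"
```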
````diff
@@ ... @@
 python3 torchchat.py generate --device cpu --pte-path stories15M.pte --prompt "Hello my name is"
 ```
 
-See below under [Mobile Execution](#run-mobile) if you want to deploy and execute a model in your iOS or Android app.
+See below under [Mobile Execution](#mobile-execution) if you want to deploy and execute a model in your iOS or Android app.
 
 
````
````diff
@@ -265,6 +278,20 @@ Read the [iOS documentation](docs/iOS.md) for more details on iOS.
 
 Read the [Android documentation](docs/Android.md) for more details on Android.
 
+**Build Native Runner Binary**
+
+We provide an end-to-end C++ [runner](runner/run.cpp) that runs the `*.pte` file exported after following the previous [ExecuTorch](#executorch) section. Note that this binary is for demo purposes; please follow the respective documentation to see how to build a similar application on iOS and Android. To build the runner binary on your Mac or Linux:
+
+```bash
+scripts/build_native.sh et
+```
+
+Run:
+
+```bash
+cmake-out/et_run model.pte -z tokenizer.model -i "Once upon a time"
+```
+
 ## Fine-tuned models from torchtune
 
 torchchat supports running inference with models fine-tuned using [torchtune](https://github.com/pytorch/torchtune). To do so, we first need to convert the checkpoints into a format supported by torchchat.
````
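As with the AOTI runner, a hedged end-to-end sketch of the ExecuTorch path; the `export` invocation is an assumption inferred from the `--pte-path stories15M.pte` usage above, while the build and run steps come from this diff:

```bash
# Hypothetical end-to-end ExecuTorch flow (the export command is an assumption).
python3 torchchat.py export stories15M --output-pte-path stories15M.pte  # assumed export syntax
scripts/build_native.sh et                                               # from this PR
cmake-out/et_run stories15M.pte -z tokenizer.model -i "Once upon a time"
```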
````diff
 ${TORCHCHAT_ROOT}/${ET_BUILD_DIR}/src/executorch/${CMAKE_OUT_DIR}/extension/data_loader/libextension_data_loader.a # This one does not get installed by ExecuTorch
-optimized_kernels
-quantized_kernels
-portable_kernels
-cpublas
-eigen_blas
-# The libraries below need to be whole-archived linked
-optimized_native_cpu_ops_lib
-quantized_ops_lib
-xnnpack_backend
-XNNPACK
-pthreadpool
-cpuinfo
+executorch
+extension_module
+extension_data_loader
+optimized_kernels
+quantized_kernels
+portable_kernels
+cpublas
+eigen_blas
+# The libraries below need to be whole-archived linked
````
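The "whole-archived linked" comment refers to forcing the linker to keep every object file in those static archives, so kernels and backends registered via static initializers are not dropped as unreferenced. A minimal sketch of what that looks like with GNU ld, using library names taken from the list above; the actual link line produced by the build scripts is not shown in this diff:

```bash
# Sketch only: whole-archive linking with GNU ld. Objects inside the
# whole-archived libraries are kept even when no symbol is referenced
# directly, preserving their static op/backend registrations.
c++ runner.o \
  -Wl,--whole-archive \
    liboptimized_native_cpu_ops_lib.a libquantized_ops_lib.a libxnnpack_backend.a \
  -Wl,--no-whole-archive \
  libexecutorch.a libextension_module.a libextension_data_loader.a \
  -o et_run
```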