Description
I'm trying to run the Qwen3-0.6B model on Android using XNNPACK, following the instructions in the qwen3 example and step 4 of the llama example. When running

```bash
./llama_main --model_path qwen3-0_6b.pte --tokenizer_path tokenizer.json --prompt "Hi" --seq_len 120
```

on a Galaxy S23, I got the following error while setting up the pretokenizer:
```
I 00:00:00.003396 executorch:cpuinfo_utils.cpp:62] Reading file /sys/devices/soc0/image_version
I 00:00:00.003583 executorch:main.cpp:76] Resetting threadpool with num threads = 4
I 00:00:00.008963 executorch:runner.cpp:90] Creating LLaMa runner: model_path=qwen3-0_6b.pte, tokenizer_path=tokenizer.json
Setting up pretokenizer...
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
E0000 00:00:1747192804.212017 24371 re2.cc:237] Error parsing '((?i:'s|'t|'re|'ve|'m|'ll|'d)|[^\r\n\p{L}\p{N}]?\p{L}+|\p{N}| ?[^\s\p{L}\p{N}]+[\r\n]*|\s*[\r\n]+|\s...': invalid perl operator: (?!
RE2 failed to compile pattern with lookahead: (?i:'s|'t|'re|'ve|'m|'ll|'d)|[^\r\n\p{L}\p{N}]?\p{L}+|\p{N}| ?[^\s\p{L}\p{N}]+[\r\n]*|\s*[\r\n]+|\s+(?!\S)|\s+
Error: invalid perl operator: (?!
Compile with SUPPORT_REGEX_LOOKAHEAD=ON to enable support for lookahead patterns.
libc++abi: terminating due to uncaught exception of type std::runtime_error: Error: 4
Aborted
```
It seems the runner now accepts the `.json` format for the tokenizer. Is there anything I'm missing?
Environment

- executorch main branch (b173722)
- Android NDK r28b
- Galaxy S23
- Ran `install_executorch.sh --pybind xnnpack` and `examples/models/llama/install_requirements.sh`.
Commands used (taken from the instructions)

- model

```bash
python -m examples.models.llama.export_llama \
  --model qwen3-0_6b \
  --params examples/models/qwen3/0_6b_config.json \
  -kv \
  --use_sdpa_with_kv_cache \
  -d fp32 \
  -X \
  --xnnpack-extended-ops \
  -qmode 8da4w \
  --metadata '{"get_bos_id": 151644, "get_eos_ids":[151645]}' \
  --output_name="qwen3-0_6b.pte" \
  --verbose
```
- runner

```bash
cmake -DCMAKE_TOOLCHAIN_FILE=$ANDROID_NDK/build/cmake/android.toolchain.cmake \
  -DANDROID_ABI=arm64-v8a \
  -DANDROID_PLATFORM=android-23 \
  -DCMAKE_INSTALL_PREFIX=cmake-out-android \
  -DCMAKE_BUILD_TYPE=Release \
  -DEXECUTORCH_BUILD_EXTENSION_DATA_LOADER=ON \
  -DEXECUTORCH_BUILD_EXTENSION_MODULE=ON \
  -DEXECUTORCH_BUILD_EXTENSION_TENSOR=ON \
  -DEXECUTORCH_ENABLE_LOGGING=1 \
  -DPYTHON_EXECUTABLE=python \
  -DEXECUTORCH_BUILD_XNNPACK=ON \
  -DEXECUTORCH_BUILD_KERNELS_OPTIMIZED=ON \
  -DEXECUTORCH_BUILD_KERNELS_QUANTIZED=ON \
  -DEXECUTORCH_BUILD_KERNELS_CUSTOM=ON \
  -Bcmake-out-android .

cmake --build cmake-out-android -j16 --target install --config Release

cmake -DCMAKE_TOOLCHAIN_FILE=$ANDROID_NDK/build/cmake/android.toolchain.cmake \
  -DANDROID_ABI=arm64-v8a \
  -DANDROID_PLATFORM=android-23 \
  -DCMAKE_INSTALL_PREFIX=cmake-out-android \
  -DCMAKE_BUILD_TYPE=Release \
  -DPYTHON_EXECUTABLE=python \
  -DEXECUTORCH_BUILD_XNNPACK=ON \
  -DEXECUTORCH_BUILD_KERNELS_OPTIMIZED=ON \
  -DEXECUTORCH_BUILD_KERNELS_QUANTIZED=ON \
  -DEXECUTORCH_BUILD_KERNELS_CUSTOM=ON \
  -Bcmake-out-android/examples/models/llama \
  examples/models/llama

cmake --build cmake-out-android/examples/models/llama -j16 --config Release
```
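For completeness, a minimal sketch of fetching the tokenizer.json used above, assuming it comes straight from the stock Qwen/Qwen3-0.6B Hugging Face repo (the report does not say where it was obtained):

```bash
# Assumption: standard Hugging Face "resolve" URL layout for Qwen/Qwen3-0.6B.
curl -L -o tokenizer.json \
  https://huggingface.co/Qwen/Qwen3-0.6B/resolve/main/tokenizer.json
```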
Afterwards, I pushed the .pte file, tokenizer.json, and llama_main to the device with adb and executed the command above.
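A minimal sketch of those push-and-run steps; the /data/local/tmp/llama staging directory is an assumption, not stated in the report:

```bash
# Assumed on-device staging directory; adjust to taste.
adb shell mkdir -p /data/local/tmp/llama
adb push qwen3-0_6b.pte /data/local/tmp/llama/
adb push tokenizer.json /data/local/tmp/llama/
adb push cmake-out-android/examples/models/llama/llama_main /data/local/tmp/llama/
adb shell "cd /data/local/tmp/llama && chmod +x llama_main && \
  ./llama_main --model_path qwen3-0_6b.pte --tokenizer_path tokenizer.json --prompt \"Hi\" --seq_len 120"
```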
Activity
kirklandsign commented on May 14, 2025
Hi @kkimmk

Please add `-DSUPPORT_REGEX_LOOKAHEAD=ON` when configuring the runner build. Something like:
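A sketch based on the runner configure command from the report, with only the lookahead flag added:

```bash
cmake -DCMAKE_TOOLCHAIN_FILE=$ANDROID_NDK/build/cmake/android.toolchain.cmake \
  -DANDROID_ABI=arm64-v8a \
  -DANDROID_PLATFORM=android-23 \
  -DCMAKE_INSTALL_PREFIX=cmake-out-android \
  -DCMAKE_BUILD_TYPE=Release \
  -DPYTHON_EXECUTABLE=python \
  -DEXECUTORCH_BUILD_XNNPACK=ON \
  -DEXECUTORCH_BUILD_KERNELS_OPTIMIZED=ON \
  -DEXECUTORCH_BUILD_KERNELS_QUANTIZED=ON \
  -DEXECUTORCH_BUILD_KERNELS_CUSTOM=ON \
  -DSUPPORT_REGEX_LOOKAHEAD=ON \
  -Bcmake-out-android/examples/models/llama \
  examples/models/llama

cmake --build cmake-out-android/examples/models/llama -j16 --config Release
```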
kirklandsign commented on May 14, 2025
cc @jackzhxng should we add this flag to the docs? https://github.com/pytorch/executorch/tree/main/examples/models/qwen3#example-run
jackzhxng commented on May 14, 2025
Yup, let's do it
kkimmk commented on May 15, 2025
Adding `-DSUPPORT_REGEX_LOOKAHEAD=ON` solved the error. Thank you.