🚀 Dockerize llamacpp #132
Merged
14 commits
ce509c7  feat: dockerize llamacpp  (bernatvadell)
6d9ad10  feat: split build & runtime stages  (bernatvadell)
9959b1f  split dockerfile into main & tools  (bernatvadell)
a4590d3  add quantize into tool docker image  (bernatvadell)
901c34d  Update .devops/tools.sh  (bernatvadell)
44f7467  add docker action pipeline  (bernatvadell)
60cf707  Merge branch 'master' into feat/dockerize  (bernatvadell)
3bcfc2b  change CI to publish at github docker registry  (bernatvadell)
c202819  fix name runs-on macOS-latest is macos-latest (lowercase)  (bernatvadell)
c6b2c6f  include docker versioned images  (bernatvadell)
4941df7  fix github action docker  (bernatvadell)
0bc1e80  fix docker.yml  (bernatvadell)
50fa1a0  Merge branch 'master' into feat/dockerize  (bernatvadell)
79a48d9  feat: include all-in-one command tool & update readme.md  (bernatvadell)
.devops/full.Dockerfile
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,17 @@ | ||
ARG UBUNTU_VERSION=22.04 | ||
|
||
FROM ubuntu:$UBUNTU_VERSION as build | ||
|
||
RUN apt-get update && \ | ||
apt-get install -y build-essential python3 python3-pip | ||
|
||
RUN pip install --upgrade pip setuptools wheel \ | ||
&& pip install torch torchvision torchaudio sentencepiece numpy | ||
|
||
WORKDIR /app | ||
|
||
COPY . . | ||
|
||
RUN make | ||
|
||
ENTRYPOINT ["/app/.devops/tools.sh"] |
.devops/main.Dockerfile
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,18 @@ | ||
ARG UBUNTU_VERSION=22.04 | ||
|
||
FROM ubuntu:$UBUNTU_VERSION as build | ||
|
||
RUN apt-get update && \ | ||
apt-get install -y build-essential | ||
|
||
WORKDIR /app | ||
|
||
COPY . . | ||
|
||
RUN make | ||
|
||
FROM ubuntu:$UBUNTU_VERSION as runtime | ||
|
||
COPY --from=build /app/main /main | ||
|
||
ENTRYPOINT [ "/main" ] |
.devops/tools.sh
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,46 @@ | ||
#!/bin/bash | ||
set -e | ||
|
||
# Read the first argument into a variable | ||
arg1="$1" | ||
|
||
# Shift the arguments to remove the first one (a bare shift fails under set -e when there are none) | ||
if [[ $# -gt 0 ]]; then shift; fi | ||
|
||
# The remaining arguments are forwarded as "$@" so values with spaces (such as a prompt) stay intact | ||
|
||
if [[ $arg1 == '--convert' || $arg1 == '-c' ]]; then | ||
    python3 ./convert-pth-to-ggml.py "$@" | ||
elif [[ $arg1 == '--quantize' || $arg1 == '-q' ]]; then | ||
    ./quantize "$@" | ||
elif [[ $arg1 == '--run' || $arg1 == '-r' ]]; then | ||
    ./main "$@" | ||
elif [[ $arg1 == '--download' || $arg1 == '-d' ]]; then | ||
    python3 ./download-pth.py "$@" | ||
elif [[ $arg1 == '--all-in-one' || $arg1 == '-a' ]]; then | ||
    echo "Downloading model..." | ||
    python3 ./download-pth.py "$1" "$2" | ||
    echo "Converting PTH to GGML..." | ||
    python3 ./convert-pth-to-ggml.py "$1/$2" 1 | ||
    for i in "$1/$2"/ggml-model-f16.bin*; do | ||
        if [ -f "${i/f16/q4_0}" ]; then | ||
            echo "Skip model quantization, it already exists: ${i/f16/q4_0}" | ||
        else | ||
            echo "Quantizing $i into ${i/f16/q4_0}..." | ||
            ./quantize "$i" "${i/f16/q4_0}" 2 | ||
        fi | ||
    done | ||
else | ||
    echo "Unknown command: $arg1" | ||
    echo "Available commands: " | ||
    echo " --run (-r): Run a model previously converted into ggml" | ||
    echo " ex: -m /models/7B/ggml-model-q4_0.bin -p \"Building a website can be done in 10 simple steps:\" -t 8 -n 512" | ||
    echo " --convert (-c): Convert a llama model into ggml" | ||
    echo " ex: \"/models/7B/\" 1" | ||
    echo " --quantize (-q): Quantize a ggml model" | ||
    echo " ex: \"/models/7B/ggml-model-f16.bin\" \"/models/7B/ggml-model-q4_0.bin\" 2" | ||
    echo " --download (-d): Download original llama model from CDN: https://agi.gpt4.org/llama/" | ||
    echo " ex: \"/models/\" 7B" | ||
    echo " --all-in-one (-a): Execute --download, --convert & --quantize" | ||
    echo " ex: \"/models/\" 7B" | ||
fi |
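The `--all-in-one` loop above pairs every `ggml-model-f16.bin*` shard with a `q4_0` sibling and skips shards whose target already exists. The following is a minimal Python sketch of that pairing, for illustration only. `plan_quantization` is a hypothetical helper that is not part of this PR, and it substitutes within the file name only, whereas the shell expansion `${i/f16/q4_0}` replaces the first `f16` anywhere in the path.

```python
from pathlib import Path


def plan_quantization(model_dir):
    """Return (source, target, skip) for every f16 shard in model_dir.

    Hypothetical helper mirroring the shell loop: each ggml-model-f16.bin*
    file maps to a q4_0 sibling, and skip is True when that target exists.
    """
    plan = []
    for src in sorted(Path(model_dir).glob("ggml-model-f16.bin*")):
        # Replace only the first "f16" in the file name, like ${i/f16/q4_0}.
        dst = src.with_name(src.name.replace("f16", "q4_0", 1))
        plan.append((src, dst, dst.exists()))
    return plan
```

Calling `plan_quantization("/models/7B")` before running the container would show which shards the loop is expected to quantize and which it would skip.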
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,24 @@ | ||
*.o | ||
*.a | ||
.cache/ | ||
.vs/ | ||
.vscode/ | ||
.DS_Store | ||
|
||
build/ | ||
build-em/ | ||
build-debug/ | ||
build-release/ | ||
build-static/ | ||
build-no-accel/ | ||
build-sanitize-addr/ | ||
build-sanitize-thread/ | ||
|
||
models/* | ||
|
||
/main | ||
/quantize | ||
|
||
arm_neon.h | ||
compile_commands.json | ||
Dockerfile |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -19,7 +19,7 @@ jobs: | |
make | ||
|
||
macOS-latest: | ||
- runs-on: macOS-latest | ||
+ runs-on: macos-latest | ||
|
||
steps: | ||
- name: Clone | ||
|
.github/workflows/docker.yml
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,61 @@ | ||
# This workflow uses actions that are not certified by GitHub. | ||
# They are provided by a third-party and are governed by | ||
# separate terms of service, privacy policy, and support | ||
# documentation. | ||
|
||
# GitHub recommends pinning actions to a commit SHA. | ||
# To get a newer version, you will need to update the SHA. | ||
# You can also reference a tag or branch, but the action may change without warning. | ||
|
||
name: Publish Docker image | ||
|
||
on: | ||
  pull_request: | ||
  push: | ||
    branches: | ||
      - master | ||
|
||
jobs: | ||
  push_to_registry: | ||
    name: Push Docker image to GitHub Container Registry | ||
    runs-on: ubuntu-latest | ||
    env: | ||
      COMMIT_SHA: ${{ github.sha }} | ||
    strategy: | ||
      matrix: | ||
        config: | ||
          - { tag: "light", dockerfile: ".devops/main.Dockerfile" } | ||
          - { tag: "full", dockerfile: ".devops/full.Dockerfile" } | ||
    steps: | ||
      - name: Check out the repo | ||
        uses: actions/checkout@v3 | ||
|
||
      - name: Set up QEMU | ||
        uses: docker/setup-qemu-action@v2 | ||
|
||
      - name: Set up Docker Buildx | ||
        uses: docker/setup-buildx-action@v2 | ||
|
||
      - name: Log in to GitHub Container Registry | ||
        uses: docker/login-action@v2 | ||
        with: | ||
          registry: ghcr.io | ||
          username: ${{ github.actor }} | ||
          password: ${{ secrets.GITHUB_TOKEN }} | ||
|
||
      - name: Build and push Docker image (versioned) | ||
        if: github.event_name == 'push' | ||
        uses: docker/build-push-action@v4 | ||
        with: | ||
          context: . | ||
          push: true | ||
          tags: "ghcr.io/ggerganov/llama.cpp:${{ matrix.config.tag }}-${{ env.COMMIT_SHA }}" | ||
          file: ${{ matrix.config.dockerfile }} | ||
|
||
      - name: Build and push Docker image (tagged) | ||
        uses: docker/build-push-action@v4 | ||
        with: | ||
          context: . | ||
          push: ${{ github.event_name == 'push' }} | ||
          tags: "ghcr.io/ggerganov/llama.cpp:${{ matrix.config.tag }}" | ||
          file: ${{ matrix.config.dockerfile }} |
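The two build steps above differ only in when they push: the versioned step runs on `push` events alone, while the tagged step always builds but pushes only on `push`. As a rough model of the outcome, here is a small Python sketch. `pushed_tags` is a hypothetical function written for this note and the commit SHA in the usage line is a placeholder; only the image name and the `light`/`full` tags come from the workflow.

```python
def pushed_tags(event_name, tag, sha, image="ghcr.io/ggerganov/llama.cpp"):
    """Return the image references pushed for one matrix entry.

    Hypothetical model of the workflow: pull requests still build the
    image, but nothing is pushed to the registry.
    """
    if event_name != "push":
        return []
    # On push, both the commit-specific tag and the moving tag are published.
    return [f"{image}:{tag}-{sha}", f"{image}:{tag}"]
```

For example, `pushed_tags("push", "light", "<sha>")` lists a `light-<sha>` reference and a `light` reference, while `pushed_tags("pull_request", "light", "<sha>")` is empty.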
download-pth.py
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,66 @@ | ||
import os | ||
import sys | ||
from tqdm import tqdm | ||
import requests | ||
|
||
if len(sys.argv) < 3: | ||
    print("Usage: download-pth.py dir-model model-type\n") | ||
    print(" model-type: Available models 7B, 13B, 30B or 65B") | ||
    sys.exit(1) | ||
|
||
modelsDir = sys.argv[1] | ||
model = sys.argv[2] | ||
|
||
num = { | ||
    "7B": 1, | ||
    "13B": 2, | ||
    "30B": 4, | ||
    "65B": 8, | ||
} | ||
|
||
if model not in num: | ||
    print(f"Error: model {model} is not valid, provide 7B, 13B, 30B or 65B") | ||
    sys.exit(1) | ||
|
||
print(f"Downloading model {model}") | ||
|
||
files = ["checklist.chk", "params.json"] | ||
|
||
for i in range(num[model]): | ||
    files.append(f"consolidated.0{i}.pth") | ||
|
||
resolved_path = os.path.abspath(os.path.join(modelsDir, model)) | ||
os.makedirs(resolved_path, exist_ok=True) | ||
|
||
for file in files: | ||
    dest_path = os.path.join(resolved_path, file) | ||
|
||
    if os.path.exists(dest_path): | ||
        print(f"Skip file download, it already exists: {file}") | ||
        continue | ||
|
||
    url = f"https://agi.gpt4.org/llama/LLaMA/{model}/{file}" | ||
    response = requests.get(url, stream=True) | ||
    with open(dest_path, 'wb') as f: | ||
        with tqdm(unit='B', unit_scale=True, miniters=1, desc=file) as t: | ||
            for chunk in response.iter_content(chunk_size=1024): | ||
                if chunk: | ||
                    f.write(chunk) | ||
                    t.update(len(chunk)) | ||
|
||
files2 = ["tokenizer_checklist.chk", "tokenizer.model"] | ||
for file in files2: | ||
    dest_path = os.path.join(modelsDir, file) | ||
|
||
    if os.path.exists(dest_path): | ||
        print(f"Skip file download, it already exists: {file}") | ||
        continue | ||
|
||
    url = f"https://agi.gpt4.org/llama/LLaMA/{file}" | ||
    response = requests.get(url, stream=True) | ||
    with open(dest_path, 'wb') as f: | ||
        with tqdm(unit='B', unit_scale=True, miniters=1, desc=file) as t: | ||
            for chunk in response.iter_content(chunk_size=1024): | ||
                if chunk: | ||
                    f.write(chunk) | ||
                    t.update(len(chunk)) |
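One thing to watch in the script above: the response status is never checked and the body is written straight to the final path, so a failed or interrupted download can leave a file that the "already exists" check then skips on the next run. A possible hardening is sketched below, under the assumption that a progress bar is optional. It swaps `requests` and `tqdm` for the standard-library `urllib`, and `fetch` is a hypothetical helper name, not code from this PR.

```python
import os
import shutil
import urllib.request


def fetch(url, dest_path, chunk_size=1024 * 1024):
    """Download url to dest_path; return False if dest_path already exists.

    The body is streamed to a temporary ".part" file and renamed only after
    the transfer completes, so a failed run is retried next time.
    """
    if os.path.exists(dest_path):
        return False
    tmp_path = dest_path + ".part"
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses, so an
    # error page is never saved as if it were model data.
    with urllib.request.urlopen(url) as response, open(tmp_path, "wb") as f:
        shutil.copyfileobj(response, f, chunk_size)
    os.replace(tmp_path, dest_path)
    return True
```

Both download loops in the script could call one helper of this shape, which would also remove the duplicated streaming code.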