feat: Amazon SageMaker compatible images #103
Conversation
- Added `sagemaker` target to Dockerfile and custom entrypoint
- Added `build-and-push-sagemaker-image` step to `build_*` workflows
Hi, do you happen to know if this supports re-ranker models?
It does (see the example request in https://github.com/huggingface/text-embeddings-inference#using-re-rankers-models).
Hi team, is there an expected date for merging this PR? I would like to use these images :)
FYI - here's a working notebook to deploy TEI on SageMaker in the meantime: https://github.com/andrewrreed/hf-notebooks/blob/main/deploy-tei-sagemaker/tei-sagemaker-inference.ipynb
Hey @JGalego, thank you for starting this work. We would need to make some changes to match the environment variables, similar to TGI: https://github.com/huggingface/text-generation-inference/blob/main/sagemaker-entrypoint.sh. Also, due to the complexity of SageMaker, we will build one GPU container containing all binaries for the target (https://github.com/huggingface/text-embeddings-inference/blob/main/Dockerfile-cuda-all).
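For reference, a minimal sketch of what such an entrypoint could look like, following the pattern of TGI's `sagemaker-entrypoint.sh`: translate the `HF_*` environment variables SageMaker injects into the variables the TEI router reads, then serve on port 8080, where SageMaker sends `/ping` and `/invocations`. This is a hypothetical illustration, not the merged entrypoint; the exact variable names are assumptions.

```shell
#!/bin/bash
# Hypothetical SageMaker entrypoint sketch for TEI, modeled on TGI's
# sagemaker-entrypoint.sh. Maps SageMaker's HF_* env vars onto the
# MODEL_ID/REVISION variables assumed to be read by the router.

build_cmd() {
    if [ -z "${HF_MODEL_ID}" ]; then
        echo "error: HF_MODEL_ID must be set" >&2
        return 1
    fi
    export MODEL_ID="${HF_MODEL_ID}"
    if [ -n "${HF_MODEL_REVISION}" ]; then
        export REVISION="${HF_MODEL_REVISION}"
    fi
    # SageMaker routes /ping and /invocations to the container on port 8080
    echo "text-embeddings-router --port 8080"
}

# In the real image the script would end with: exec $(build_cmd)
```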
Hey @philschmid, I can add the env var mappings for … Regarding the …
Let's start only with the "all" versions.
Thank you for working on this! As soon as this is merged, we'll work with the SageMaker team to make it officially available inside the sagemaker-sdk.
Thank you for this and sorry for the delay!
@JGalego Hi, I have a question. Does this work when using an S3 path for model artifacts instead of the HF Hub? I get an endpoint error if I don't specify … Appreciate any advice, thanks!
For the SM images, …
@JGalego Would it be possible to update this so AWS customers can alternatively use S3 for model storage? I can submit a feature request if needed.
Hi @philschmid, I set … Here's my code: …
@austinmw This looks more like your artifact not being created correctly. Can you check https://huggingface.co/docs/sagemaker/inference#create-a-model-artifact-for-deployment?
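A common cause of such endpoint errors is archiving the model directory itself rather than its contents. A hedged sketch of packaging an artifact the way the linked HF docs describe (file names here are placeholders):

```shell
# Sketch: build model.tar.gz with the model files at the archive root.
# A nested top-level folder inside the tarball breaks model loading.
mkdir -p my_model                   # hypothetical directory of model files
echo '{}' > my_model/config.json    # placeholder; real files come from the Hub

# tar from *inside* the directory so files sit at the archive root
(cd my_model && tar -czf ../model.tar.gz *)

# Inspect: entries should have no leading "my_model/" prefix
tar -tzf model.tar.gz
```

The artifact can then be uploaded with e.g. `aws s3 cp model.tar.gz s3://<bucket>/model.tar.gz`.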
@philschmid Just tried with new artifacts using the provided steps and got the exact same error.
Is it possible this feature is broken for TEI?
Nope, it should work. I am working on a SageMaker end-to-end example with training and deployment, and it works. Can you try uploading your model unzipped and then deploying it using the "uncompressed" setting?

```python
s3_path = "s3://path/to/ymodel"

model = HuggingFaceModel(
    role=role,
    # path to s3 bucket with model, we are not using a compressed model
    model_data={
        "S3DataSource": {
            "S3Uri": s3_path + "/",
            "S3DataType": "S3Prefix",
            "CompressionType": "None",
        }
    },
    image_uri=image,
    env=config,
)
```
That worked, thanks a ton for your help!
What does this PR do?
Adds support for Amazon SageMaker compatible images. Similar to huggingface/text-generation-inference#147, only for TEI.
Mostly CI stuff and some hacks, since the Amazon SageMaker routes are already implemented.
Implementation:
- Added `sagemaker` target to `Dockerfile-cuda` and custom entrypoint
- Added `build-and-push-sagemaker-image` step to `build_*` workflows

Who can review?
@OlivierDehaene