Add diffusers utils #104


Merged
merged 17 commits into main from add-diffusers-utils
Nov 17, 2023

Conversation

philschmid
Collaborator

What does this PR do?

This PR adds support for diffusers and the text-to-image task for zero-code deployments. This will allow customers to deploy any diffusers model supported by AutoPipeline.

This PR adds:

  • diffusers_utils.py, which handles the heavy lifting for the zero-code deployment, along with new unit and integration tests
  • an additional decoder for serializing the generated image into binary for the response
  • support for loading safetensors and sharded models from the Hugging Face Hub
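Under the hood, the zero-code text-to-image path can be pictured roughly like this. This is a minimal sketch, not the PR's actual API: the function names are hypothetical, and diffusers is imported lazily since it is an optional dependency.

```python
from io import BytesIO


def load_text_to_image_pipeline(model_dir: str, device: int = -1):
    # Sketch: AutoPipeline resolves the concrete diffusers pipeline class
    # (e.g. an SDXL pipeline) from the model's config on the Hub or on disk.
    from diffusers import AutoPipelineForText2Image  # lazy import; optional dependency

    pipe = AutoPipelineForText2Image.from_pretrained(model_dir)
    return pipe.to(f"cuda:{device}") if device >= 0 else pipe


def encode_image_png(image) -> bytes:
    # Sketch of the "additional decoder": a PIL image is serialized
    # into PNG bytes so it can be returned as the HTTP response body.
    buffer = BytesIO()
    image.save(buffer, format="PNG")
    return buffer.getvalue()
```

The PNG encoder is independent of diffusers: it only relies on the image object exposing a PIL-style `save(buffer, format=...)` method.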

Test locally

  1. Manually change MMS_CONFIG_FILE:

wget -O sagemaker-mms.properties https://github.com/raw/aws/deep-learning-containers/master/huggingface/build_artifacts/inference/config.properties

  2. Run the container, e.g. for text-to-image:

HF_MODEL_ID="stabilityai/stable-diffusion-xl-base-1.0" HF_TASK="text-to-image" python src/sagemaker_huggingface_inference_toolkit/serving.py

  3. Adjust handler_service.py and comment out if content_type in content_types.UTF8_TYPES: (that's needed for SageMaker but cannot be used locally)

  4. Send a request:

curl --request POST \
  --url http://localhost:8080/invocations \
  --header 'Accept: image/png' \
  --header 'Content-Type: application/json' \
  --data '"{\"inputs\": \"Camera\"}"' \
  --output image.png
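Equivalently, the request from step 4 can be built with Python's standard library only. This is a sketch; note the payload is a JSON-encoded string that itself contains JSON, mirroring the stringified body the curl example sends.

```python
import json
import urllib.request

# Double-encoded body, matching the curl example above:
# a JSON string whose content is itself the JSON payload.
payload = json.dumps(json.dumps({"inputs": "Camera"})).encode("utf-8")

request = urllib.request.Request(
    "http://localhost:8080/invocations",
    data=payload,
    headers={"Content-Type": "application/json", "Accept": "image/png"},
)

# Uncomment with the server from step 2 running:
# with urllib.request.urlopen(request) as response:
#     with open("image.png", "wb") as f:
#         f.write(response.read())
```

Decoding the payload twice recovers the original dict, which is exactly the "stringified and encoded" body shape discussed in the review below.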

(Screenshot: example generated image, 2023-09-20)

@davidthomas426
Contributor

From your description:
"3. Adjust handler_service.py and comment out if content_type in content_types.UTF8_TYPES: (that's needed for SageMaker but cannot be used locally)"

Why is that needed for SageMaker but cannot be used locally? That's a strange requirement for testing locally and does not give me confidence that everything here works correctly.

@philschmid
Collaborator Author

From your description: "3. Adjust handler_service.py and comment out if content_type in content_types.UTF8_TYPES: (that's needed for SageMaker but cannot be used locally)"

Why is that needed for SageMaker but cannot be used locally? That's a strange requirement for testing locally and does not give me confidence that everything here works correctly.

It seems that when the container is deployed on SageMaker, the "body" is stringified and encoded. When starting the MMS server locally (no container), the body is just what you pass. Since we are not including the Dockerfile here, we need to use MMS locally.

@davidthomas426
Contributor

From your description: "3. Adjust handler_service.py and comment out if content_type in content_types.UTF8_TYPES: (that's needed for SageMaker but cannot be used locally)"
Why is that needed for SageMaker but cannot be used locally? That's a strange requirement for testing locally and does not give me confidence that everything here works correctly.

It seems that when the container is deployed on SageMaker, the "body" is stringified and encoded. When starting the MMS server locally (no container), the body is just what you pass. Since we are not including the dockerfile here, we need to use MMS locally.

Ok, but whatever the reason, the "body" may be a bytes or a str when called, and your code could just handle that properly. So, if it's already a str, don't call decode. This is just Python 3 unicode handling.

json.loads will call decode("utf-8") for you if handed a bytes. See below:

>>> import json
>>> json_string = '{"🙃": 17}'
>>> json_bytes = json_string.encode("utf-8")

>>> dump = lambda v: print(f"{type(v)}  =>  {repr(v)}")
>>> dump(json_string)
<class 'str'>  =>  '{"🙃": 17}'
>>> dump(json_bytes)
<class 'bytes'>  =>  b'{"\xf0\x9f\x99\x83": 17}'

>>> json.loads(json_string)
{'🙃': 17}
>>> json.loads(json_bytes)
{'🙃': 17}
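So the tolerant handling being suggested could be as small as this (a hypothetical helper, not code from this PR):

```python
import json


def parse_body(body):
    # Hypothetical helper: accept bytes or str without an explicit decode().
    # json.loads decodes UTF-8 bytes itself (Python 3.6+), so both input
    # types produce the same parsed result.
    return json.loads(body)
```

With this shape, the SageMaker path (encoded bytes) and the local MMS path (plain str) go through the same code without any branch to comment out.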

        trust_remote_code=TRUST_REMOTE_CODE,
        model_kwargs={"device_map": "auto", "torch_dtype": torch_dtype},
    )
elif is_diffusers_available() and task == "text-to-image":
Contributor

Would it make sense to go to the else case if task == "text-to-image" but is_diffusers_available() returns False?

Collaborator Author

No, there you need both.

Contributor

But that's what this if-elif-else chain will do. If is_diffusers_available() is False and task is "text-to-image", then this branch will fail and it will go to the else clause.

So you're saying that doesn't make sense?

Collaborator Author

You mean we should rather raise an error when task == "text-to-image" but diffusers is not available?

Contributor

Yeah, probably. Right now, the code does something that doesn't make sense in that case, because it falls through to the else clause.

I said that from the beginning. But yeah, at this point this back and forth is a bit ridiculous, so whatever. It's probably fine in practice. But still, this if-elif-else logic seems kind of wrong. I don't know how to be more clear.
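The failing-loudly behavior being argued for could be sketched like this (illustrative names and return values, not the toolkit's actual code):

```python
def get_pipeline(task: str, diffusers_available: bool):
    # Hypothetical restructuring: make the unsupported combination an
    # explicit error instead of letting it fall through to the default
    # branch of the if-elif-else chain.
    if task == "text-to-image":
        if not diffusers_available:
            raise ImportError(
                "task 'text-to-image' requires diffusers, but it is not installed"
            )
        return "diffusers-pipeline"
    return "default-pipeline"
```

The difference from the merged chain is only in the missing-dependency case: the user gets an immediate, descriptive error rather than an attempt to load a text-to-image model through the default transformers pipeline.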

# load pipeline
hf_pipeline = pipeline(task=task, model=model_dir, device=device, **kwargs)
if TRUST_REMOTE_CODE and os.environ.get("HF_MODEL_ID", None) is not None and device == 0:
    torch_dtype = torch.bfloat16 if torch.cuda.get_device_capability()[0] == 8 else torch.float16
Contributor

This if-elif-else seems odd to me. Could the if ever match when we want to use diffusers?

Also, why is this extra logic in the if case regarding bfloat16 and whatever else only used when TRUST_REMOTE_CODE is True, but then TRUST_REMOTE_CODE is also passed in other cases?

Collaborator Author

What is odd to you?

  1. The first branch is for TRUST_REMOTE_CODE, which enables custom modelling code from the Hub; it only works if there is an HF_MODEL_ID and you have GPUs. That was needed for MPT.
  2. The second checks whether diffusers is installed and whether we have a diffusers task, to load an image-generation model.
  3. The third is the default, where we try to load the custom pipeline.

Contributor

It's strange because there's special bfloat16 logic but only when TRUST_REMOTE_CODE is True. I mentioned that in my comment. And it also seemed strange to me because of what I said in my other comment below.
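For reference, the dtype selection in that first branch reduces to a compute-capability check. Here it is sketched as a pure function, with string return values for illustration only:

```python
def pick_torch_dtype(capability_major: int) -> str:
    # Mirrors the PR's logic: bfloat16 on compute capability 8.x GPUs
    # (Ampere class), float16 on everything else. Note the exact-equality
    # check, as in the merged code.
    return "bfloat16" if capability_major == 8 else "float16"
```

Written this way, the selection could be shared by every branch of the chain rather than being confined to the TRUST_REMOTE_CODE path, which is the asymmetry the reviewer is pointing out.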

@philschmid
Collaborator Author

Ok, but whatever the reason, the "body" may be a bytes or a str when called, and your code could just handle that properly. So, if it's already a str, don't call decode. This is just Python 3 unicode handling.

json.loads will call decode("utf-8") for you if handed a bytes. See below:

That's what was implemented and what SageMaker is doing. I don't want to change something that is not wrong; it is just for testing. Or can you guarantee it's not needed?

Contributor

@davidthomas426 left a comment

I'm approving because I don't want to hold things up. But the lack of bfloat16 logic in any branch but the first TRUST_REMOTE_CODE one, and the other issue I pointed out with the if-elif-else clause, seem kind of wrong to me.

In practice, it may be fine. I'll leave that up to you.

@philschmid
Collaborator Author

I'm approving because I don't want to hold things up. But the lack of bfloat16 logic in any branch but the first TRUST_REMOTE_CODE one, and the other issue I pointed out with the if-elif-else clause, seem kind of wrong to me.

Let me double check and make sure the tests are green.

@philschmid merged commit a72f5d2 into main on Nov 17, 2023
@philschmid deleted the add-diffusers-utils branch on November 17, 2023 08:35