Add diffusers utils #104


Merged
merged 17 commits into main from add-diffusers-utils
Nov 17, 2023

Conversation

philschmid
Collaborator

What does this PR do?

This PR adds support for diffusers and the text-to-image task for zero-code deployments. This will allow customers to deploy any diffusers model supported by AutoPipeline.

This PR adds:

  • diffusers_utils.py, which handles the heavy lifting for the zero-code deployment, along with new unit and integration tests
  • an additional decoder for serializing the generated image into binary for the response
  • support for loading safetensors and sharded models from the Hugging Face Hub
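Under the hood, the zero-code text-to-image path can be pictured roughly like this. This is a minimal sketch, not the PR's actual API: the function names are hypothetical, and diffusers is imported lazily since it is an optional dependency.

```python
from io import BytesIO


def load_text_to_image_pipeline(model_dir: str, device: int = -1):
    # Sketch: AutoPipeline resolves the concrete diffusers pipeline class
    # (e.g. an SDXL pipeline) from the model's config on the Hub or on disk.
    from diffusers import AutoPipelineForText2Image  # lazy import; optional dependency

    pipe = AutoPipelineForText2Image.from_pretrained(model_dir)
    return pipe.to(f"cuda:{device}") if device >= 0 else pipe


def encode_image_png(image) -> bytes:
    # Sketch of the "additional decoder": a PIL image is serialized
    # into PNG bytes so it can be returned as the HTTP response body.
    buffer = BytesIO()
    image.save(buffer, format="PNG")
    return buffer.getvalue()
```

The PNG encoder is independent of diffusers: it only relies on the image object exposing a PIL-style `save(buffer, format=...)` method.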

Test locally

  1. Manually change MMS_CONFIG_FILE:

wget -O sagemaker-mms.properties https://github.com/raw/aws/deep-learning-containers/master/huggingface/build_artifacts/inference/config.properties

  2. Run the container, e.g. for text-to-image:

HF_MODEL_ID="stabilityai/stable-diffusion-xl-base-1.0" HF_TASK="text-to-image" python src/sagemaker_huggingface_inference_toolkit/serving.py

  3. Adjust handler_service.py and comment out if content_type in content_types.UTF8_TYPES: (that's needed for SageMaker but cannot be used locally)

  4. Send a request:

curl --request POST \
  --url http://localhost:8080/invocations \
  --header 'Accept: image/png' \
  --header 'Content-Type: application/json' \
  --data '"{\"inputs\": \"Camera\"}"' \
  --output image.png
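Equivalently, the request from step 4 can be built with Python's standard library only. This is a sketch; note the payload is a JSON-encoded string that itself contains JSON, mirroring the stringified body the curl example sends.

```python
import json
import urllib.request

# Double-encoded body, matching the curl example above:
# a JSON string whose content is itself the JSON payload.
payload = json.dumps(json.dumps({"inputs": "Camera"})).encode("utf-8")

request = urllib.request.Request(
    "http://localhost:8080/invocations",
    data=payload,
    headers={"Content-Type": "application/json", "Accept": "image/png"},
)

# Uncomment with the server from step 2 running:
# with urllib.request.urlopen(request) as response:
#     with open("image.png", "wb") as f:
#         f.write(response.read())
```

Decoding the payload twice recovers the original dict, which is exactly the "stringified and encoded" body shape discussed in the review below.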

(Screenshot: example generated image, 2023-09-20)

@davidthomas426
Contributor

From your description:
"3. Adjust handler_service.py and comment out if content_type in content_types.UTF8_TYPES: (that's needed for SageMaker but cannot be used locally)"

Why is that needed for SageMaker but cannot be used locally? That's a strange requirement for testing locally and does not give me confidence that everything here works correctly.

@philschmid
Collaborator Author

From your description: "3. Adjust handler_service.py and comment out if content_type in content_types.UTF8_TYPES: (that's needed for SageMaker but cannot be used locally)"

Why is that needed for SageMaker but cannot be used locally? That's a strange requirement for testing locally and does not give me confidence that everything here works correctly.

It seems that when the container is deployed on SageMaker, the "body" is stringified and encoded. When starting the MMS server locally (no container), the body is just what you pass. Since we are not including the Dockerfile here, we need to use MMS locally.

@davidthomas426
Contributor

From your description: "3. Adjust handler_service.py and comment out if content_type in content_types.UTF8_TYPES: (that's needed for SageMaker but cannot be used locally)"
Why is that needed for SageMaker but cannot be used locally? That's a strange requirement for testing locally and does not give me confidence that everything here works correctly.

It seems that when the container is deployed on SageMaker, the "body" is stringified and encoded. When starting the MMS server locally (no container), the body is just what you pass. Since we are not including the dockerfile here, we need to use MMS locally.

Ok, but whatever the reason, the "body" may be a bytes or a str when called, and your code could just handle that properly. So, if it's already a str, don't call decode. This is just Python 3 unicode handling.

json.loads will call decode("utf-8") for you if handed a bytes. See below:

>>> import json
>>> json_string = '{"🙃": 17}'
>>> json_bytes = json_string.encode("utf-8")

>>> dump = lambda v: print(f"{type(v)}  =>  {repr(v)}")
>>> dump(json_string)
<class 'str'>  =>  '{"🙃": 17}'
>>> dump(json_bytes)
<class 'bytes'>  =>  b'{"\xf0\x9f\x99\x83": 17}'

>>> json.loads(json_string)
{'🙃': 17}
>>> json.loads(json_bytes)
{'🙃': 17}
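So the tolerant handling being suggested could be as small as this (a hypothetical helper, not code from this PR):

```python
import json


def parse_body(body):
    # Hypothetical helper: accept bytes or str without an explicit decode().
    # json.loads decodes UTF-8 bytes itself (Python 3.6+), so both input
    # types produce the same parsed result.
    return json.loads(body)
```

With this shape, the SageMaker path (encoded bytes) and the local MMS path (plain str) go through the same code without any branch to comment out.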

        trust_remote_code=TRUST_REMOTE_CODE,
        model_kwargs={"device_map": "auto", "torch_dtype": torch_dtype},
    )
elif is_diffusers_available() and task == "text-to-image":
Contributor

Would it make sense to go to the else case if task == "text-to-image" but is_diffusers_available() returns False?

Collaborator Author

No, there you need both.

Contributor

But that's what this if-elif-else chain will do. If is_diffusers_available() is False and task is "text-to-image", then this branch will fail and it will go to the else clause.

So you're saying that doesn't make sense?

Collaborator Author

You mean we should rather raise an error when task == "text-to-image" but diffusers is not available?

Contributor

Yeah, probably. Right now, the code does something that doesn't make sense in that case, because it falls through to the else clause.

I said that from the beginning. But yeah, at this point this back and forth is a bit ridiculous, so whatever. It's probably fine in practice. But still, this if-elif-else logic seems kind of wrong. I don't know how to be more clear.
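The failing-loudly behavior being argued for could be sketched like this (illustrative names and return values, not the toolkit's actual code):

```python
def get_pipeline(task: str, diffusers_available: bool):
    # Hypothetical restructuring: make the unsupported combination an
    # explicit error instead of letting it fall through to the default
    # branch of the if-elif-else chain.
    if task == "text-to-image":
        if not diffusers_available:
            raise ImportError(
                "task 'text-to-image' requires diffusers, but it is not installed"
            )
        return "diffusers-pipeline"
    return "default-pipeline"
```

The difference from the merged chain is only in the missing-dependency case: the user gets an immediate, descriptive error rather than an attempt to load a text-to-image model through the default transformers pipeline.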

# load pipeline
hf_pipeline = pipeline(task=task, model=model_dir, device=device, **kwargs)
if TRUST_REMOTE_CODE and os.environ.get("HF_MODEL_ID", None) is not None and device == 0:
    torch_dtype = torch.bfloat16 if torch.cuda.get_device_capability()[0] == 8 else torch.float16
Contributor

This if-elif-else seems odd to me. Could the if ever match when we want to use diffusers?

Also, why is this extra logic in the if case regarding bfloat16 and whatever else only used when TRUST_REMOTE_CODE is True, but then TRUST_REMOTE_CODE is also passed in other cases?

Collaborator Author

What is odd to you?

  1. The first branch is for TRUST_REMOTE_CODE, which enables custom modelling code from the Hub; it only works if there is an HF_MODEL_ID and you have GPUs. That was needed for MPT.
  2. The second checks whether diffusers is installed and whether we have a diffusers task, to load an image-generation model.
  3. The third is the default, where we try to load the custom pipeline.

Contributor

It's strange because there's special bfloat16 logic but only when TRUST_REMOTE_CODE is True. I mentioned that in my comment. And it also seemed strange to me because of what I said in my other comment below.
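For reference, the dtype selection in that first branch reduces to a compute-capability check. Here it is sketched as a pure function, with string return values for illustration only:

```python
def pick_torch_dtype(capability_major: int) -> str:
    # Mirrors the PR's logic: bfloat16 on compute capability 8.x GPUs
    # (Ampere class), float16 on everything else. Note the exact-equality
    # check, as in the merged code.
    return "bfloat16" if capability_major == 8 else "float16"
```

Written this way, the selection could be shared by every branch of the chain rather than being confined to the TRUST_REMOTE_CODE path, which is the asymmetry the reviewer is pointing out.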

@philschmid
Collaborator Author

Ok, but whatever the reason, the "body" may be a bytes or a str when called, and your code could just handle that properly. So, if it's already a str, don't call decode. This is just Python 3 unicode handling.

json.loads will call decode("utf-8") for you if handed a bytes. See below:

That's what was implemented and what SageMaker is doing. I don't want to change something that is not wrong; it is just for testing. Or can you guarantee it's not needed?

Contributor

@davidthomas426 left a comment

I'm approving because I don't want to hold things up. But the lack of bfloat16 logic in any branch but the first TRUST_REMOTE_CODE one, and the other issue I pointed out with the if-elif-else clause, seem kind of wrong to me.

In practice, it may be fine. I'll leave that up to you.

@philschmid
Collaborator Author

I'm approving because I don't want to hold things up. But the lack of bfloat16 logic in any branch but the first TRUST_REMOTE_CODE one, and the other issue I pointed out with the if-elif-else clause, seem kind of wrong to me.

Let me double check and make sure the tests are green.

@philschmid merged commit a72f5d2 into main on Nov 17, 2023
@philschmid deleted the add-diffusers-utils branch on November 17, 2023 08:35