Sharded checkpoint support #93

Closed · JimAllanson wants to merge 4 commits

Conversation

JimAllanson

Description of changes:

In version 4.18.0 of Transformers, support for sharded checkpoints was added: huggingface/transformers#16343.
Models using this sharded format now exist in the wild, for example Blip2 (PyTorch) or ESM-2 (TensorFlow + PyTorch).

These models currently fail to load in the AWS Deep Learning Containers with Hugging Face support, because the model files are filtered out during the initial cache-population stage.

This change adds support for matching model files whose names follow the sharded format, as well as the sharded-checkpoint index files that list the shards.
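For context, here is a minimal sketch of the kind of filename matching this implies. The pattern list and helper name are illustrative, not the PR's actual code:

```python
import re

# Sharded checkpoints split the weights across numbered files and add a
# JSON index that maps tensors to shards, e.g.
#   pytorch_model.bin                      (single file)
#   pytorch_model-00001-of-00002.bin       (shard)
#   pytorch_model.bin.index.json           (shard index)
#   tf_model-00001-of-00002.h5             (TensorFlow shard)
_CHECKPOINT_PATTERNS = [
    r"(pytorch|tf)_model(-\d{5}-of-\d{5})?\.(bin|h5)",
    r"(pytorch|tf)_model\.(bin|h5)\.index\.json",
]

def is_checkpoint_file(filename: str) -> bool:
    """Return True for single-file checkpoints, shards, and shard indexes."""
    return any(re.fullmatch(p, filename) for p in _CHECKPOINT_PATTERNS)

assert is_checkpoint_file("pytorch_model.bin")
assert is_checkpoint_file("pytorch_model-00001-of-00002.bin")
assert is_checkpoint_file("pytorch_model.bin.index.json")
assert not is_checkpoint_file("training_args.bin")
```

A pattern along these lines accepts both the single-file and sharded naming schemes, so the cache-population filter no longer drops the shard files or the index that ties them together.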


By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@philschmid (Collaborator)

Hey @BaiqingL, this PR #104 should have fixed that.

@philschmid closed this Apr 29, 2024