
[PoC] separate decoding from datasets #5105

Closed
wants to merge 9 commits into from

Conversation

pmeier
Collaborator

@pmeier pmeier commented Dec 16, 2021

Addresses #5075 (comment).

With this patch, samples are undecoded by default but can easily be decoded:

from torchvision.prototype import datasets

encoded_sample = next(iter(datasets.load("caltech101")))
for key, value in sorted(encoded_sample.items()):
    print(f"{key}: {type(value)}")

print("#" * 80)

decoded_sample = datasets.utils.decode_sample(encoded_sample)
for key, value in sorted(decoded_sample.items()):
    print(f"{key}: {type(value)}")
ann: <class 'torchvision.prototype.datasets.utils._decoder.DecodeableStreamWrapper'>
ann_path: <class 'str'>
image: <class 'torchvision.prototype.datasets.utils._decoder.DecodeableImageStreamWrapper'>
image_path: <class 'str'>
label: <class 'torchvision.prototype.features._label.Label'>
################################################################################
ann_path: <class 'str'>
bounding_box: <class 'torchvision.prototype.features._bounding_box.BoundingBox'>
contour: <class 'torchvision.prototype.features._feature.Feature'>
image: <class 'torchvision.prototype.features._image.Image'>
image_path: <class 'str'>
label: <class 'torchvision.prototype.features._label.Label'>
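
As a rough illustration of the idea (samples stay encoded until the user asks for decoding), here is a minimal, self-contained sketch. It does not use torchvision: `Decodable`, this `decode_sample`, and the toy decoder are hypothetical stand-ins and not the classes from `_decoder.py`. It also does not model one value expanding into several keys, as `ann` does in the output above.

```python
from typing import Any, Callable, Dict


class Decodable:
    """Hypothetical stand-in for a value that still needs decoding."""

    def __init__(self, raw: bytes, decoder: Callable[[bytes], Any]) -> None:
        self.raw = raw
        self.decoder = decoder

    def decode(self) -> Any:
        return self.decoder(self.raw)


def decode_sample(sample: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict in which every Decodable value is decoded."""
    return {
        key: value.decode() if isinstance(value, Decodable) else value
        for key, value in sample.items()
    }


encoded_sample = {
    # `list` acts as a toy "image decoder" that exposes the byte values.
    "image": Decodable(b"\x01\x02\x03", decoder=list),
    "image_path": "image_0001.jpg",
}
decoded_sample = decode_sample(encoded_sample)
```

Values that are not wrapped, such as the path string, pass through unchanged, which mirrors how `image_path` and `label` keep their types in the output above.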

Of course, decode_sample can also be applied through .map:

from torchvision.prototype import datasets

dataset = datasets.load(...).map(datasets.utils.decode_sample)

For even more convenience, this also adds a SampleDecoder datapipe that is a thin wrapper around Mapper applying decode_sample. Although I generally favor the class interface, I think this is a case where the functional interface comes in handy, because most users will want to use the default decoders:

from torchvision.prototype import datasets

dataset = datasets.load(...).decode_samples()
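
The relationship between the explicit `.map(decode_sample)` call and the `.decode_samples()` shortcut can be sketched without torchdata. In the sketch below, `Pipe`, `_Mapper`, and the toy `decode_sample` are hypothetical names chosen for illustration; the sketch only assumes that the shortcut forwards to a mapper with the default decoding function.

```python
from typing import Any, Callable, Dict, Iterable, Iterator

Sample = Dict[str, Any]


def decode_sample(sample: Sample) -> Sample:
    """Toy decoder: turn every bytes value into a list of ints."""
    return {k: list(v) if isinstance(v, bytes) else v for k, v in sample.items()}


class Pipe:
    """Hypothetical, minimal iterable pipe with a chainable interface."""

    def __init__(self, source: Iterable[Sample]) -> None:
        self.source = source

    def __iter__(self) -> Iterator[Sample]:
        return iter(self.source)

    def map(self, fn: Callable[[Sample], Sample]) -> "Pipe":
        return _Mapper(self, fn)

    def decode_samples(self) -> "Pipe":
        # Functional shortcut: a mapper with the default decoding function.
        return self.map(decode_sample)


class _Mapper(Pipe):
    """Applies fn lazily to each sample of the wrapped pipe."""

    def __init__(self, source: Iterable[Sample], fn: Callable[[Sample], Sample]) -> None:
        super().__init__(source)
        self.fn = fn

    def __iter__(self) -> Iterator[Sample]:
        for sample in self.source:
            yield self.fn(sample)


pipe = Pipe([{"image": b"\x00\xff", "label": 3}])
via_map = list(pipe.map(decode_sample))
via_shortcut = list(pipe.decode_samples())
```

Both spellings produce the same samples; the shortcut only saves the user from importing and passing the default function.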

@ reviewers: Don't worry about the large diff. I already touched all datasets to see if I missed an edge case in my proposal. That was not the case, so it is sufficient to have a look at torchvision/prototype/datasets/utils/_decoder.py and one implementation, for example torchvision/prototype/datasets/_builtin/caltech.py. I did not fix the tests yet, so they are expected to fail. I'll only do that if you are otherwise happy with the proposal.

cc @pmeier @bjuncek

@facebook-github-bot

facebook-github-bot commented Dec 16, 2021

💊 CI failures summary and remediations

As of commit 1406bd3 (more details on the Dr. CI page):


💚 💚 Looks good so far! There are no failures yet. 💚 💚



@pmeier
Collaborator Author

pmeier commented Dec 22, 2021

After some more discussion, we moved away from supplying custom file handles to the user and will instead return the encoded data as uint8 tensors. See #5075 (comment) for details.

With the current implementation, loading data from a dataset now looks like this:

from torchvision.prototype import datasets

for sample in datasets.load("caltech101").map(datasets.utils.decode_images):
    ...

decode_images is only a thin wrapper around the workhorse decode_sample that sets a default decoder for images. There will be a decode_videos in the future, but it will probably need additional parameters compared to decode_images, so we cannot unify the two.
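
The "thin wrapper that sets a default decoder" pattern can be sketched as follows. This is not the PR's implementation: `EncodedImage`, the `decoders` mapping, and `toy_image_decoder` are assumptions made for the example, and plain `bytes` stand in for the uint8 tensors mentioned above.

```python
from typing import Any, Callable, Dict


class EncodedImage(bytes):
    """Hypothetical marker type for still-encoded image data."""


def decode_sample(
    sample: Dict[str, Any], decoders: Dict[type, Callable[[Any], Any]]
) -> Dict[str, Any]:
    """Apply the decoder registered for a value's exact type, if any."""
    return {
        key: decoders[type(value)](value) if type(value) in decoders else value
        for key, value in sample.items()
    }


def toy_image_decoder(data: EncodedImage) -> list:
    # Placeholder for a real image decoder; it only exposes the byte values.
    return list(data)


def decode_images(
    sample: Dict[str, Any],
    decoder: Callable[[EncodedImage], Any] = toy_image_decoder,
) -> Dict[str, Any]:
    """Thin wrapper: decode_sample with a default decoder for images."""
    return decode_sample(sample, {EncodedImage: decoder})


sample = {"image": EncodedImage(b"\x07\x08"), "path": "a.jpg", "blob": b"\x01"}
decoded = decode_images(sample)
custom = decode_images(sample, decoder=len)
```

A video counterpart would likely need extra parameters of its own, which is one reason to keep it as a separate wrapper rather than a flag on this one.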

In the future, we can also provide a transform that handles the decoding, so it can simply be used as the first transform in a Compose:

from torchvision.prototype import datasets, transforms

transform = transforms.Compose(
    transforms.DecodeImages(),
    transforms.Resize(...),
    ...
)

for sample in datasets.load(...).map(transform):
    ...

@pmeier
Collaborator Author

pmeier commented Jan 26, 2022

Superseded by #5287

@pmeier pmeier closed this Jan 26, 2022
@pmeier pmeier deleted the datasets/decoding-poc branch January 26, 2022 17:01