Contributor

@scotts scotts commented Jul 2, 2025

This is obviously not even a draft, but rather the skeleton of what the tutorial will cover. I'd like to defer the actual language of the text to later, and instead focus on:

  1. Are there any examples that are missing?
  2. For the examples we have, should they change in any way?
  3. When we have a single image in the row, it seems to get stretched to the width of the column of text. The resulting image looks bad. Do we have an alternative here?
  4. I don't love the rendering of the bounding boxes - it's super faint when I've looked at it locally. But I'm using the utility functions to generate them, and I don't want to change the hard-coded value (width=3) in there. Ideas?

We'll want to have figures that show what happens with the bounding boxes - let's also leave that to later.

pytorch-bot bot commented Jul 2, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/vision/9140

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (2 Unrelated Failures)

As of commit f6d1838 with merge base d247de8:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@NicolasHug
Member

Thanks a lot for the PR!

> Are there any examples that are missing?

Let's try to add crop and perspective

> For the examples we have, should they change in any way?

They LGTM, we can add more examples for rotation and elastic which I think will also address the point below:

> When we have a single image in the row, it seems to get stretched to the width of the column of text. The resulting image looks bad. Do we have an alternative here?

There probably is an actual fix for this, but I don't know it off the top of my head, and fighting with matplotlib is a challenge. Although, ChatGPT should be able to help now. In any case, for both RandomRotation and Elastic, we can just call them 5 times and show 5 different images, since these transformations are random?
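
A minimal sketch of the "call it 5 times" idea, under stated assumptions: `sample_random_transform` and `fake_random_rotation` are hypothetical names used only for illustration, and the stand-in transform just returns a random angle instead of rotating pixels. In the tutorial, the callable would presumably be a real random transform such as `v2.RandomRotation` applied to `orig_img`.

```python
import random
from typing import Any, Callable, List


def sample_random_transform(transform: Callable[[Any], Any], img: Any, n: int = 5) -> List[Any]:
    """Apply the same random transform n times to one input and collect the results."""
    return [transform(img) for _ in range(n)]


def fake_random_rotation(img: Any) -> tuple:
    """Hypothetical stand-in for a random transform: pairs the input with a random angle."""
    angle = random.uniform(0, 180)
    return (img, angle)


random.seed(0)  # fixed seed so the sketch is repeatable
row = sample_random_transform(fake_random_rotation, "orig_img", n=5)
print(len(row))  # number of images that would fill one plot row
```

Because each call draws fresh random parameters, the row holds several different outputs, which sidesteps the single stretched image.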

> I don't love the rendering of the bounding boxes - it's super faint when I've looked at it locally. But I'm using the utility functions to generate them, and I don't want to change the hard-coded value (width=3) in there. Ideas?

We can modify the plotting helper as we please, it's something I wrote specifically for the tutorial. We can just add a new draw_bbox_kwargs parameter or something like this, while keeping the existing default?
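
One possible shape for that parameter, as a sketch only: the helper's real signature is not shown in this thread, so `draw_boxes`, `resolve_draw_bbox_kwargs`, and `recording_draw_fn` are hypothetical names, and the default color is an assumption (only `width=3` comes from the discussion). The drawing function is passed in so the sketch runs without torchvision; in the helper it would presumably be `torchvision.utils.draw_bounding_boxes`.

```python
from typing import Any, Callable, Dict, Optional

# width=3 is the hard-coded value mentioned in the thread; the color is an assumption.
DEFAULT_DRAW_BBOX_KWARGS: Dict[str, Any] = {"colors": "yellow", "width": 3}


def resolve_draw_bbox_kwargs(draw_bbox_kwargs: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Merge caller overrides on top of the existing defaults without mutating them."""
    merged = dict(DEFAULT_DRAW_BBOX_KWARGS)
    merged.update(draw_bbox_kwargs or {})
    return merged


def draw_boxes(img: Any, boxes: Any, draw_fn: Callable[..., Any],
               draw_bbox_kwargs: Optional[Dict[str, Any]] = None) -> Any:
    """Hypothetical helper: forwards the merged keyword arguments to the drawing function."""
    return draw_fn(img, boxes, **resolve_draw_bbox_kwargs(draw_bbox_kwargs))


def recording_draw_fn(img: Any, boxes: Any, **kwargs: Any) -> Dict[str, Any]:
    """Stand-in drawing function that returns the keyword arguments it received."""
    return kwargs


default_call = draw_boxes("img", "boxes", recording_draw_fn)
thicker_call = draw_boxes("img", "boxes", recording_draw_fn, {"width": 6})
print(default_call, thicker_call)
```

Existing callers keep the current defaults, while the tutorial can pass a larger `width` where the boxes look faint.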

],
format="CXCYWHR",
canvas_size=(orig_img.size[1], orig_img.size[0]),
clamping_mode="hard",
Member

I think we should omit setting clamping_mode here, so that we illustrate the behavior of the boxes with their default clamping_mode: soft.

Comment on lines 47 to 48
# TODO: why is this necessary?
orig_box = v2.ConvertBoundingBoxFormat("xyxyxyxy")(orig_box)
Member

👀

Contributor Author

I figured out why, explained in code comments.

# Clamping Modes
# --------------
# Explain hard and soft, with appropriate links to documentation. Talk about
# defaults. Link to to-be-written-tutorial on mode-setting in general.
Member

Maybe we can write the tutorial on mode-setting here in this same tutorial? It might make it easier for users to find all the relevant info in one single place. Obviously this can be done in a separate PR.

@NicolasHug NicolasHug merged commit 9024472 into pytorch:main Jul 8, 2025
57 of 61 checks passed
AntoineSimoulin pushed a commit to AntoineSimoulin/vision that referenced this pull request Jul 9, 2025
facebook-github-bot pushed a commit that referenced this pull request Jul 31, 2025
Reviewed By: AntoineSimoulin

Differential Revision: D79175028

fbshipit-source-id: 8ed8fef904151637a5cbedfd76cd2c0e6e4a0b19