-
Notifications
You must be signed in to change notification settings - Fork 7.1k
Random colors for drawing boxes #5127
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
💊 CI failures summary and remediationsAs of commit 056a033 (more details on the Dr. CI page):
1 failure not recognized by patterns:
This comment was automatically generated by Dr. CI (expand for details).Please report bugs/suggestions to the (internal) Dr. CI Users group. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi! This does the job! I verified the outputs over test cases. Will post the same here.
Script to test the same. Choice of boxes can be improved. :)
Case 1: - Case 2: - Case 3: - Case 4: - |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @oke-aditya for the PR. Given that this was originally discussed with @NicolasHug, we should wait for his review.
In general though, I'm not sure why we want to have a random color palette by default. I fully agree it is better than using only white for everything, but I'm not sure why randomizing the colors is better than fixed ones. Apart from the non-determinism, by generating the colors at random we also risk using very similar colors for different classes, which can be confusing. It is actually not easy to have multiple colors that are easily distinguishable by humans. I suggest we hardcode color maps such as this one. It "only" defines colors for 12 classes. If we need more, we can always supplement with randomly generated afterwards.
The color palette issue is always the concern. I don't exactly recall the conclusion from #2785 #2556 #3296. I agree with your points. A good default palette > random palette > white. It is possible to have one (or maybe more than one) good default color palettes, say supporting upto 12 classes. The problem of colors actually arises when there are many objects of different classes in the same image. I am not sure if there are cases where we have > 12 classes in same image. So I guess it should be okay if two colors are little bit close to each other. The chance of colors being close and having both the classes same is probably a bit low. Colors is kinda hard and I don't have any great solution. |
torchvision/utils.py
Outdated
@@ -169,6 +169,8 @@ def draw_bounding_boxes( | |||
colors (color or list of colors, optional): List containing the colors | |||
of the boxes or single color for all boxes. The color can be represented as | |||
PIL strings e.g. "red" or "#FF00FF", or as RGB tuples e.g. ``(240, 10, 157)``. | |||
By default, random colors are generated for boxes. | |||
If labels are provided, boxes with same labels have same color. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm concerned with this API change because the behaviour changes depending on whether the labels are defined. Effectively the new API tries to do things for the user instead of letting them decide what to do. I think it would be best to stick to generating a random palette for as many masks as we detected. This will be similar to what we do on the segmentation drawing tool and will enable us to reuse methods such as _generate_color_palette()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@datumbox would you mind detailing what you mean by "enable us to reuse methods such as _generate_color_palette()" ?
Having one box color per label is a default behaviour that makes sense to me, and @oke-aditya followed my earlier suggestion of #4528 (comment). It avoids users some extra code.
draw_segmentation_masks()
doesn't accept a label
parameter, so it's not unexpected that its behaviour differs from draw_bounding_boxes()
in that respect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@NicolasHug I was thinking that utils._generate_color_palette(num_masks)
can be used to generate a colour per mask (not per label) as currently happens on the segmentation case.
Concerning the lack of support for labels on segmentation, this is indeed the situation with our utility but doesn't have to be the case going forwards. Other libraries such as detectron2 offer visualizations for segmentation and masks that display the labels.
TBH I haven't followed the original discussions (I try to catch up coming back from PTO). My thoughts on this is that if we want to keep the utility simple, we could let the users define one colour per mask. If they need to apply a more complex logic (such as keeping the same colour per label), this should be done outside of the utility using their own custom logic and provide the list of colours to our util. This would align the approach across segmentation and detection. Let me know your thoughts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concerning the lack of support for labels on segmentation, this is indeed the situation with our utility but doesn't have to be the case going forwards. Other libraries such as detectron2 offer visualizations for segmentation and masks that display the labels.
I agree. And if we were to support labels in draw_segmentation_masks()
, I think it would also make sense to have one color per label by default :)
I also agree that complexity is a concern, both regarding the code and regarding what the user would expect from that function. As far as I can tell the only extra code that handling labels adds are these few lines below:
if labels is not None:
label_color_map = dict(zip(labels, colors))
colors = [label_color_map[label] for label in labels]
which I feel is minimal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Users still can draw colors for their labels. If both labels and colors are passed. The ith
box would get the ith
color, irrespective of the labels.
I guess the current improvement just improves how the the default colors are handled. The need to pass a color palette reduces.
test/test_utils.py
Outdated
@@ -317,5 +325,18 @@ def test_draw_keypoints_errors(): | |||
utils.draw_keypoints(image=img, keypoints=invalid_keypoints) | |||
|
|||
|
|||
def test_random_colors(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some tests so that we are OK with the implementation. I guess these aren't very necessary but anyways good to have and don't hurt much.
test/test_utils.py
Outdated
color_t = utils._generate_random_color() | ||
assert isinstance(color_t, tuple) | ||
assert len(color_t) == 3 | ||
assert 256 not in color_t |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Checks the overflow so that we don't go beyond 256.
How should we proceed @datumbox @NicolasHug @pmeier ? I'm bit more inclined with Philip's thoughts. While this would make visualizing boxes with same labels a bit easier. A flip side is the color palette is still messy.
Maybe we should have a good default color palette for say around 12 or 15 colors and then opt for random color generation? |
@oke-aditya As I mentioned earlier, I recommend not handling the per-label colouring within the method but instead accepting a list of colours for each box. Having this extra assumption in the code is not worth it as there are many ways to handle colouring for "stuff vs things". For reference I provide how Detectron2 does it, which shows that a) there are many ways to achieve this and b) you need extra information to handle it effectively:
|
I understand your point @datumbox . We make an implicit assumption of colors and labels. Let's not think about This PR also tackles another issue. The default white color palette for all boxes (if no color / color palette is passed) looks bad. I believe if we don't keep the label coloring consistency, we should still consider not using plain white as default for every box. |
@oke-aditya Though strictly speaking BC breaking, I don't think it would be a big problem to replace the whiles with a reasonable randomly generated palette and use the same colour both for boxes and for labels. @NicolasHug any concerns with that? |
I agree we don't need to be too strict with BC regarding the coloring strategy. I'm not sure I fully understand the things vs stuff discussion. It seems to me that assigning one color per label could cover a lot of useful use-cases (more so than assigning random colors), but I don't want to further delay this discussion. |
That is actually quite hard. You would need to make sure that the generated colors have a reasonable "distance" between them. Thus, I would suggest to hardcode a color palette and fallback to best effort random generation, i.e. only enforcing the uniqueness of each color, if the input exceeds the number of hardcoded colors. |
Indeed, it's a hard problem but we already something like that on |
The problem is kind of NP.
Also provide a few color palette options like We can adopt a few palettes from seaborn. Which seem to be popular. I'm not sure of re-using I agree to Phillip and this is something feasible. (Actually here is where the assumption of one color per label could make the stuff easier. It's less likely that there are variety of objects and we would need lot of colors. More often an object will be either repeated or have num boxes < 10. But let's leave it for later discussion.) |
@oke-aditya I might be missing something but aren't you the one who added the first version of
Indeed. We need a small non-complicated utility for people to use to quickly check the model results. There are far better libraries out there for complex visualizations, with detectron2 being one of them. Instead of introducing another palette making method, I would suggest to reuse what we have. Feel free to refactor if necessary. Thankfully it's a private method so there shouldn't be any BC concerns.
Please have a look on the references I provided on #5127 (comment) to see how the colours of "things" and "stuff" are handled by libraries like detectron2. As you will see, they require a lot more meta-data about the labels in order to determine if they should keep the same colour for the same label (stuff) or they should produce a slightly similar colour for each instance (things). Adding this logic in our method, without having access to these meta-data is limiting. Hence my preference not to try to do complex things and keep the behaviour simple. |
I guess we can go ahead with this :) Sorry for the delays. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, there is one more clean up we could do but it's not blocking for me. I'll leave @NicolasHug to review once more and merge if he is OK.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @oke-aditya , I gave it a quick look. It looks good, I think we can try to simplify the code a bit.
� Conflicts: � torchvision/utils.py
I had a brief chat with @datumbox over slack.
I highly agree to the thoughts and felt that it isn't very necessary for one color per class. Although most use-cases might hold it valid, considering all the possibilities it isn't good. Apart from that let's move ahead with this 😃 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well that's some great clean coding 👀 lot to learn from this diff
torchvision/utils.py
Outdated
num_boxes = boxes.shape[0] | ||
|
||
if labels is None: | ||
labels = [None] * num_boxes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm going to need @pmeier 's enlightment on the mypy issue here https://app.circleci.com/pipelines/github/pytorch/vision/14562/workflows/da348bb7-e7cb-46f5-9d6e-0ea7248d59f1/jobs/1171625
What would you suggest?
I tried
labels: Union[List[str], List[None]] = [None] * num_boxes
But mypy
complains with
torchvision/utils.py:205: error: Name "labels" already defined on line 154 [no-redef]
labels: Union[List[str], List[None]] = [None] * num_boxes
^
despite the allow_redefinition = True
option.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very naturally:
labels: Union[List[str], List[None]] = [None] * num_boxes # type: ignore[no-redef]
won't work either because mypy will instead use the first (outdated) definition instead of this new one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
despite the
allow_redefinition = True
option.
This happens because allow_redefinition
only applies to cases where the redefinition is on the same "nesting level".
What would you suggest?
If labels = [None] * ...
is a valid value, I would simply change the input annotation to reflect that: Optional[List[Optional[str]]]
. This also gives the user the option to exclude certain labels (labels=["label1", None, "label2"]
) and should be compliant with the rest of the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @pmeier for the info.
I'm not sure I understand what the "nesting rule" is meant to enforce. For example this makes mypy happy and yet it's strictly equivalent to our code:
labels: Union[List[str], List[None]] = [None] * num_boxes if labels is None else labels
The only difference is the use of the "ternary" if
instead of an indented if
block.
I'd prefer not allowing None values in the labels
list for users, as I'm not sure there's a use-case where users would have labels for some boxes, but not all (worst cast scenario they can always pass empty strings). In fact, before this PR, passing a None
value for a label would fail as draw.text
doesn't support it.
So I'll just silence mypy here (twice), because this error doesn't make much sense.
Merging, thanks a lot @oke-aditya for the PR! |
Sometimes I feel a bit odd. Maintainers say thanks a lot. But really my contribution < 30%. Thanks to you guys. Great people all around. |
@oke-aditya you suggested this improvement on your own in #4528 and took the time to discuss alternatives / implementations strategies, and got to a good working version here... while doing all of this on your free time. This is very valuable to us and to the project, and definitely accounts for a lot more than 30% :) |
LoC is a very bad metric to judge a contribution. Keep up the good work! |
) Summary: * Add random colors * Update error message, pretty the code * Update edge cases * Change implementation to tuples * Fix bugs * Add tests * Reuse palette * small rename fix * Update tests and code * Simplify code * ufmt * fixed colors -> random colors in docstring * Actually simplify further * Silence mypy. Twice. lol. Reviewed By: NicolasHug Differential Revision: D34140251 fbshipit-source-id: 84685cfb4bfbd1d9d89801f2507cf52e4fff370b Co-authored-by: Nicolas Hug <[email protected]>
Hi @datumbox & @oke-aditya, I am writing to ask about the colouring of bboxes, as it doesn't colour the bboxes per label. Instead, we are colouring per num_boxes instead per num_labels. |
@ElHouas The API was chosen to give to the user maximum freedom to colour things how they want without making assumptions on our side. You can achieve what you need by something along the lines: pallette = ['#be254a', '#dc484c', '#ef6645', '#f88c51', '#fdb365', '#fed27f', '#feeb9d', '#fffebe',
'#f0f9a7', '#d8ef9b', '#b3e0a2', '#89d0a4', '#60bba8', '#3f97b7', '#4273b3']
unique = set(labels)
labels_to_colours = dict(zip(unique, pallette[:len(unique)]))
colours = [labels_to_colours[l] for l in labels] |
@datumbox It makes sense from your side, I appreciate the quick response. |
Closes #4528 Supersedes #4658
Work in progress will detail it soon.Done! awaiting review.
:) Feels so good to be healthy and code again.
cc @pmeier (I guess Nicolas will return next week)