remove decoding from prototype datasets #5287
Conversation
💊 CI failures summary (Dr. CI): as of commit 36957ff, no failures yet.
Stamping! Happy to discuss and review in more detail down the road.
wnids = tuple(info.extra.wnid_to_category.keys())
if config.split == "train":
    images_root = root / "ILSVRC2012_img_train"
from scipy.io import savemat
While working on the fix for the validation split, I realized that the data setup was slightly wrong.
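For illustration, a minimal sketch of what such a mock train-split layout could look like on disk (folder and file names here are assumptions based on the snippet above, not torchvision's actual mock-data helpers):

```python
import tempfile
from pathlib import Path

def make_mock_imagenet_layout(root: Path) -> Path:
    # For the "train" split, images live under ILSVRC2012_img_train,
    # grouped into one folder per WordNet id (wnid).
    images_root = root / "ILSVRC2012_img_train"
    for wnid in ("n01440764", "n01443537"):  # example wnids
        (images_root / wnid).mkdir(parents=True, exist_ok=True)
        # Touch a fake image file; real mock data would write encoded bytes.
        (images_root / wnid / f"{wnid}_1.JPEG").touch()
    return images_root

root = Path(tempfile.mkdtemp())
images_root = make_mock_imagenet_layout(root)
```

The validation split uses a different layout, which is where the original setup was slightly wrong.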
bndbox = {"xmin": "1", "xmax": "2", "ymin": "3", "ymax": "4"}
def add_size(obj):
    obj = add_child(obj, "size")
    size = {"width": 0, "height": 0, "depth": 3}
VOC provides the image size together with the annotations. Since the reworked BoundingBox
requires the image size, we need to add it to the mock data.
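A sketch of how such mock VOC annotations could be assembled with the standard library. The helper names `add_child` / `add_size` mirror the snippet above but are reimplemented here as assumptions; the tag names follow the VOC annotation format:

```python
import xml.etree.ElementTree as ET

def add_child(parent, tag, text=None):
    child = ET.SubElement(parent, tag)
    child.text = text
    return child

def add_size(obj):
    # VOC stores the image size alongside the annotations; since the
    # reworked BoundingBox needs the image size, the mock data has to
    # provide a <size> element as well.
    size = add_child(obj, "size")
    for tag, value in {"width": "0", "height": "0", "depth": "3"}.items():
        add_child(size, tag, value)
    return obj

annotation = ET.Element("annotation")
add_size(annotation)
obj = add_child(annotation, "object")
bndbox = add_child(obj, "bndbox")
for tag, value in {"xmin": "1", "xmax": "2", "ymin": "3", "ymax": "4"}.items():
    add_child(bndbox, tag, value)

xml_str = ET.tostring(annotation, encoding="unicode")
```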
@@ -1,61 +0,0 @@
import pytest
Same deal as before. The tests are only partially compatible with the new features in the datasets. Thus, we remove them here and can re-add them when the transforms API is more stable.
) -> Dict[str, Any]:
    ann_data, image_data = data
    anns, image_meta = ann_data

    sample = self._collate_and_decode_image(image_data, decoder=decoder)
    if annotations:
We will never get to this point if annotations is None
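The redundancy can be illustrated with a stripped-down sketch (hypothetical names, not the actual dataset code):

```python
def prepare_sample(data, annotations=None):
    ann_data, image_data = data
    if annotations is None:
        # Early return: without annotations there is nothing more to do,
        # so a later `if annotations:` check would be unreachable.
        return {"image": image_data}
    anns, image_meta = ann_data
    return {"image": image_data, "anns": anns, "meta": image_meta}

plain = prepare_sample((("a", "m"), "img"))
full = prepare_sample((("a", "m"), "img"), annotations=True)
```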
dependencies=("scipy",),
homepage="http://home.bharathh.info/pubs/codes/SBD/download.html",
valid_options=dict(
    split=("train", "val", "train_noval"),
    boundaries=(True, False),
If we already load the mat file that stores the annotations, there is no need to return only part of the data.
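As a sketch of the point, using a plain dict in place of the structure `scipy.io.loadmat` would return (the `GTcls` / `Segmentation` / `Boundaries` keys follow the SBD annotation layout, but this is illustrative, not the PR's code):

```python
def collate_sbd_ann(mat_like):
    # The .mat file holds both the segmentation and the boundaries.
    # Since the file is fully loaded anyway, return everything instead
    # of slicing out only one of the two based on a `boundaries` option.
    gt = mat_like["GTcls"]
    return {"segmentation": gt["Segmentation"], "boundaries": gt["Boundaries"]}

ann = collate_sbd_ann({"GTcls": {"Segmentation": [[0, 1]], "Boundaries": [[1, 0]]}})
```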
if config.split == "train_noval":
    split_dp = extra_split_dp
split_dp = Filter(split_dp, path_comparator("stem", config.split))

split_dp = Filter(split_dp, path_comparator("name", f"{config.split}.txt"))
Using the name is probably more readable than using the stem.
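The difference is easy to see with `pathlib` (using a simplified stand-in for `path_comparator`, not torchvision's actual helper):

```python
from pathlib import Path

def path_comparator(attr, value):
    # Simplified sketch: compare one attribute of the path against a value.
    return lambda path: getattr(Path(path), attr) == value

split = "train"
# Matching on the stem also matches e.g. "train.csv";
# matching on the full name pins down exactly "train.txt".
by_stem = path_comparator("stem", split)
by_name = path_comparator("name", f"{split}.txt")
```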
@@ -70,14 +66,6 @@ def read_mat(buffer: io.IOBase, **kwargs: Any) -> Any:
    return sio.loadmat(buffer, **kwargs)


def image_buffer_from_array(array: np.ndarray, *, format: str = "png") -> io.BytesIO:
This is no longer needed as it always was a crutch to enable custom decoders for datasets that didn't contain encoded images in the first place.
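For reference, a helper with that signature could be sketched roughly as follows (assuming Pillow and NumPy; this is a reconstruction from the signature above, not the exact removed implementation):

```python
import io

import numpy as np
from PIL import Image

def image_buffer_from_array(array: np.ndarray, *, format: str = "png") -> io.BytesIO:
    # Encode a raw array into an in-memory image file. This was only ever
    # needed to feed custom decoders for datasets whose files are not
    # encoded images to begin with.
    buffer = io.BytesIO()
    Image.fromarray(array).save(buffer, format=format)
    buffer.seek(0)
    return buffer

buffer = image_buffer_from_array(np.zeros((4, 4, 3), dtype=np.uint8))
```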
* revamp prototype features (#5283)
* remove decoding from prototype datasets (#5287)
* remove decoder from prototype datasets
* remove unused imports
* cleanup
* fix readme
* use OneHotLabel in SEMEION
* improve voc implementation
* revert unrelated changes
* fix semeion mock data
* fix pcam
* readd functional transforms API to prototype (#5295)
* readd functional transforms
* cleanup
* add missing imports
* remove __torch_function__ dispatch
* readd repr
* readd empty line
* add test for scriptability
* remove function copy
* change import from functional tensor transforms to just functional
* fix import
* fix test
* fix prototype features and functional transforms after review (#5377)
* fix prototype functional transforms after review
* address features review
* make mypy more strict on prototype features
* make mypy more strict for prototype transforms
* fix annotation
* fix kernel tests
* add automatic feature type dispatch to functional transforms (#5323)
* add auto dispatch
* fix missing arguments error message
* remove pil kernel for erase
* automate feature specific parameter detection
* fix typos
* cleanup dispatcher call
* remove __torch_function__ from transform dispatch
* remove auto-generation
* revert unrelated changes
* remove implements decorator
* change register parameter order
* change order of transforms for readability
* add documentation for __torch_function__
* fix mypy
* inline check for support
* refactor kernel registering process
* refactor dispatch to be a regular decorator
* split kernels and dispatchers
* remove sentinels
* replace pass with ...
* appease mypy
* make single kernel dispatchers more concise
* make dispatcher signatures more generic
* make kernel checking more strict
* revert doc changes
* address Franciscos comments
* remove inplace
* rename kernel test module
* fix inplace
* remove special casing for pil and vanilla tensors
* address comments
* update docs
* cleanup features / transforms feature branch (#5406)
* mark candidates for removal
* align signature of resize_bounding_box with corresponding image kernel
* fix documentation of Feature
* remove interpolation mode and antialias option from resize_segmentation_mask
* remove or privatize functionality in features / datasets / transforms
Reviewed By: sallysyw
Differential Revision: D34265747
fbshipit-source-id: 569ed9f74ac0c026391767c3b422ca0147f55ead
Same deal as #5283. Supersedes #5105.

This removes the `decoder` keyword argument from `datasets.load`. The decoding will be performed by a transform that will be added later on. With that, the decoding part needs to be dropped from the canonical `_collate_and_decode_sample` method. Since "collate" is also used to prepare a batch of data for model consumption, I've opted to change the name to `_prepare_sample`.
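A minimal sketch of the new shape (hypothetical dataset code; names and sample layout are assumptions, with decoding handled by a separate transform):

```python
import io

def _prepare_sample(data):
    # No decoding here anymore: the sample carries the raw encoded bytes,
    # and a transform applied later is responsible for decoding them.
    path, encoded = data
    return {"path": path, "image": io.BytesIO(encoded)}

def decode_transform(sample):
    # Placeholder for the decoding transform that will be added later on.
    sample["image"] = sample["image"].read()  # stand-in for real decoding
    return sample

sample = decode_transform(_prepare_sample(("a.jpg", b"\x89PNG...")))
```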