-
Notifications
You must be signed in to change notification settings - Fork 7.1k
Eliminate runtime cyclic dependencies #6476
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
a33f1bc
to
f50d0f4
Compare
f50d0f4
to
b8ed25b
Compare
@property | ||
def _F(self) -> ModuleType: | ||
if _Feature.__F is None: | ||
from ..transforms import functional | ||
|
||
_Feature.__F = functional | ||
return _Feature.__F |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lazy import the module the first time we need it. It's shared across all _Feature
subclasses.
Given that there were problems with the vision/test/test_prototype_builtin_datasets.py Lines 104 to 135 in b4b246a
Here is a version to run the same tests for different transforms: import pickle
import pytest
import torch
from torch.utils.data import DataLoader
from torch.utils.data.graph import traverse
from torchdata.datapipes.iter import IterableWrapper
from torchvision.prototype import transforms, features
# TODO: fill me up for more comprehensive tests!
TRANSFORMS = [
transforms.Resize((8, 8)),
]
transforms = pytest.mark.parametrize("transform", TRANSFORMS, ids=lambda transform: type(transform).__name__)
@pytest.mark.parametrize("only_datapipe", [False, True])
@transforms
def test_traversable(transform, only_datapipe):
dp = IterableWrapper([]).map(transform)
traverse(dp, only_datapipe=only_datapipe)
@transforms
def test_serializable(transform):
dp = IterableWrapper([]).map(transform)
pickle.dumps(dp)
def _collate_fn(batch):
return batch
@pytest.mark.parametrize("num_workers", [0, 1])
@transforms
def test_data_loader(transform, num_workers):
dp = IterableWrapper([features.Image(torch.rand(3, 16, 16)) for _ in range(5)], deepcopy=False).map(transform)
dl = DataLoader(
dp,
batch_size=2,
num_workers=num_workers,
collate_fn=_collate_fn,
)
list(dl) So far the PR seems good. |
@pmeier Thanks for the suggestion for additional tests. I know that these tests take a while due to the large number of datasets. Shall I add just one transform for now to enhance them? aka def test_data_loader(self, dataset_mock, config, num_workers):
dataset = dataset_mock.load(config)[0].map(transforms.Resize((8, 8)))
dl = DataLoader(
dataset,
batch_size=2,
num_workers=num_workers,
collate_fn=self._collate_fn,
)
next(iter(dl)) I was thinking of doing this for all 3 tests you quoted above. |
You need at least two transforms transforms.Compose(
[
transforms.DecodeImage(),
transforms.Resize(8, 8),
]
) Otherwise, you will have no images inside the sample. I'm ok with adding these there given that our current transforms compatibility tests are quite "minimal": vision/test/test_prototype_builtin_datasets.py Lines 98 to 102 in b4b246a
@NicolasHug do you have an opinion here? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with the changes to the datasets tests minus some small nits. There is test_transformable
which is outside of the scope that GH let's me comment. It will be obsolete after your changes and you can delete it.
Given that only the test_data_loader
actually executes the transform, I don't think we should see a significant slow down. Still, let's wait for CI to complete to compare the timings. @NicolasHug any objections?
@pmeier I'll remove the extra tests to give time to you and @NicolasHug to finalize the strategy. They are not strictly needed for this work. BTW I already see some unrelated failures related to the resize on bboxes that might be worth looking into. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apart from my paranoia in #6476 (comment), I think this is GTG.
Merging, the failing tests are unrelated and are caused by dependencies issues from core. |
Summary: * Move imports on constructor. * Turn `_F` to a property. * fix linter * Fix mypy * Make it class-wide attribute. * Add tests based on code review * Making changes from code-reviews. * Remove the new tests. * Clean up. * Adding comments. * Update the comment link. Reviewed By: datumbox Differential Revision: D39013658 fbshipit-source-id: 9814c11c44127268b11a55880b9b0c087561deec
A previous approach of assigning
_F
on constructor faced issues on the data loader test as the the unit-test hang. This PR tries an alternative approach of making_F
a property, doing a lazy import and caching the the module reference on the class level. This seems to avoid the previous test issue.