
Add MovingMNIST #2690


Closed
wants to merge 1 commit into from

Conversation

vballoli
Contributor

PR in reference to #2676. Currently in draft since the dataset train-test split needs discussion. Input on how to better handle the train-test arguments is appreciated! @vfdev-5 @pmeier @fmassa

Collaborator

@vfdev-5 vfdev-5 left a comment


Thanks for the draft! I left a few comments.

@@ -435,6 +435,74 @@ def get_int(b: bytes) -> int:
    return int(codecs.encode(b, 'hex'), 16)


class MovingMNIST(VisionDataset):
    """MovingMNIST"""

Could you please add a link (http://www.cs.toronto.edu/~nitish/unsupervised_video/) to the dataset, like this:

`MNIST <http://yann.lecun.com/exdb/mnist/>`_ Dataset.

and define the docstring Args etc. as is done for other datasets.
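For example, a minimal docstring sketch in the style of the other MNIST-family datasets could look like the following (the exact Args and the ``MovingMNIST/raw/mnist_test_seq.npy`` layout are assumptions, to be adjusted to the final implementation):

    class MovingMNIST(VisionDataset):
        """`MovingMNIST <http://www.cs.toronto.edu/~nitish/unsupervised_video/>`_ Dataset.

        Args:
            root (string): Root directory of the dataset where
                ``MovingMNIST/raw/mnist_test_seq.npy`` exists.
            transform (callable, optional): A function/transform that takes in a video
                tensor and returns a transformed version.
            download (bool, optional): If True, downloads the dataset from the internet
                and puts it in the root directory. If the dataset is already
                downloaded, it is not downloaded again.
        """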

@@ -435,6 +435,74 @@ def get_int(b: bytes) -> int:
    return int(codecs.encode(b, 'hex'), 16)


class MovingMNIST(VisionDataset):

Can't we inherit it from MNIST?


If we use this as a video dataset, we shouldn't. We probably need to use MNIST to generate the training split though.

Collaborator

@pmeier pmeier left a comment


Hey @vballoli,

thanks a lot for this PR. We discussed this internally and have some comments on this:

  1. MovingMNIST is a video dataset, i.e. the individual frames have a strong temporal relation. Currently you are treating it as an image dataset. We should return a 4D tensor representing a single video (channels x depth x height x width) rather than an image with multiple channels (channels x height x width). This also means that we shouldn't treat the frames individually for the transforms.
  2. You only implemented the test split, since this is the only generated data provided by the authors. Since the authors also provide the code to generate the sequences (see for example the implementation by TensorFlow), it might be beneficial to also provide a train split. If we do this, we need to decide whether we want to generate the sequences ahead of time or on the fly (a rough generator sketch follows this comment). Depending on how large the training split will be, the former might be too much of a burden on memory (even the test split is ~1GB and resides completely in memory).
  3. Although the authors split the sequences in two with 10 frames each, this should not be required, since MovingMNIST is marketed as an unsupervised dataset. This is especially important if we generate the training data with a different number of frames. IMO we should simply return a single video. Maybe we could provide a (static) method that implements the authors' split.

Let us know what you think and if you are willing to continue working on this.
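For point 2, a rough sketch of on-the-fly generation in the spirit of the authors' procedure (MNIST digits bouncing around a 64x64 canvas). The velocity range, RNG handling, and overlay rule here are assumptions for illustration, not the authors' exact parameters:

    import numpy as np

    def generate_moving_mnist(digits, num_frames=20, canvas_size=64, digit_size=28, rng=None):
        # digits: a list of 2D uint8 arrays (e.g. two 28x28 MNIST images) to animate.
        rng = np.random.default_rng() if rng is None else rng
        lim = canvas_size - digit_size
        video = np.zeros((num_frames, canvas_size, canvas_size), dtype=np.uint8)
        pos = rng.uniform(0, lim, size=(len(digits), 2))  # per-digit (x, y) start position
        vel = rng.uniform(-4, 4, size=(len(digits), 2))   # per-digit velocity (assumed range)
        for t in range(num_frames):
            for d, digit in enumerate(digits):
                x, y = pos[d].astype(int)
                patch = video[t, y:y + digit_size, x:x + digit_size]
                np.maximum(patch, digit, out=patch)  # overlay digits, keeping the brighter pixel
            pos += vel
            # reflect positions and flip velocities at the canvas borders ("bouncing")
            for d in range(len(digits)):
                for axis in range(2):
                    if pos[d, axis] < 0:
                        pos[d, axis], vel[d, axis] = -pos[d, axis], -vel[d, axis]
                    elif pos[d, axis] > lim:
                        pos[d, axis], vel[d, axis] = 2 * lim - pos[d, axis], -vel[d, axis]
        return video  # (num_frames, 64, 64); add a channel dim to obtain the 4D video tensor

Generating on the fly keeps memory usage flat but makes samples non-deterministic unless a seed is fixed; generating ahead of time trades memory (or disk space) for reproducibility.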


            self.file)))


    def __getitem__(self, index: int) -> Tuple[Any, Any]:

Without a length, the dataset is not iterable.

Suggested change
-    def __getitem__(self, index: int) -> Tuple[Any, Any]:
+    def __len__(self) -> int:
+        return len(self.data)
+
+    def __getitem__(self, index: int) -> Tuple[Any, Any]:

        with open(path, 'rb') as f:
            x = torch.tensor(np.load(f))
        assert(x.dtype == torch.uint8)
        return x

Nit

Suggested change
return x
return x

@fmassa
Member

fmassa commented Sep 22, 2020

To complement @pmeier's answer, if you could let the dataset return a single tensor of shape [num_frames, channels, height, width], that would be great.
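A minimal sketch of a __getitem__ along those lines, assuming the raw mnist_test_seq.npy (stored frame-major, to my knowledge as (num_frames, num_sequences, 64, 64)) has already been transposed to (num_sequences, num_frames, 64, 64) at load time:

    def __getitem__(self, index: int) -> torch.Tensor:
        # self.data: uint8 tensor of shape (num_sequences, num_frames, 64, 64)
        video = self.data[index].unsqueeze(1)  # -> [num_frames, channels=1, height, width]
        if self.transform is not None:
            video = self.transform(video)
        return video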

@avijit9
Contributor

avijit9 commented Mar 18, 2021

@pmeier Is somebody working on closing this PR? If not, I can help.

@pmeier
Collaborator

pmeier commented Mar 19, 2021

@avijit9 No, as far as I'm aware, no one is currently working on this. Still, let's wait and see how #3562 plays out. We are currently not sure which datasets we want to add.

@yassineAlouini
Contributor

Thanks @vballoli for this great contribution, and sorry for the long delay before getting a new reply!

Unfortunately, there is a new datasets API in the works here, and new datasets should follow the new API design.

That being said, rest assured, there are at least two options:

  • Given that you already put in the work in this PR, a new PR adding this to the new API is more than welcome (don't hesitate to ask for help, for example @pmeier is very knowledgeable). As a starting point, you can refer to this guide on how to do that.

  • Someone else could take this over and build on what you have done. You will receive proper credit of course. 👍

Hope one of the two options works for you @vballoli. 👌
